Complexity nomenclature

Ola Fosheim Grøstad via Digitalmars-d digitalmars-d at puremagic.com
Fri Dec 4 08:39:30 PST 2015


On Friday, 4 December 2015 at 15:33:56 UTC, Jonathan M Davis 
wrote:
> Only when dealing with an arbitrary number of elements. O(n^2) 
> could be fine if you know for a fact that you're always dealing 
> with a very small number of elements.

I think there was a misunderstanding about notation here. If we 
know we are dealing with a small, bounded number of elements, 
then all operations are O(1) by definition, since any fixed 
constant is O(1).

> different algorithms perform there. O(1) is definitely better, 
> but it's not necessarily required.

Well, O(1) isn't better, it is just mandatory for anything real 
time. E.g. if you fix your buffer to 1 second of audio, that is 
44100 samples, and filling it is O(44100) == O(1).
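
A minimal sketch in D (the names and the delegate-based fill are 
my own illustration, not something from the thread): because the 
buffer length is a compile-time constant, the fill loop has a 
hard, known bound.

    // The buffer length is fixed at compile time, so filling it
    // is a bounded loop: O(44100) == O(1).
    enum sampleRate = 44_100;
    float[sampleRate] buffer;

    void fill(float delegate(size_t) sample)
    {
        foreach (i; 0 .. buffer.length)  // bound known up front
            buffer[i] = sample(i);
    }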

If I am doing a scanline renderer for 2048 pixels, then an 
insertion sort is faster than merge sort, because most line 
segments only move a few pixels per scanline. The catch is that 
there might be a 0.1% chance that most lines are nearly 
horizontal at a variety of angles, and then the sort degrades. 
That is limiting, but I can still work out the hard maximum time 
required to complete, so it is still O(1) and tractable as a 
real-time operation.
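
To illustrate the nearly-sorted case (the Edge struct and names 
are hypothetical, not from any particular renderer): insertion 
sort only shifts each element past the few entries it has crossed 
since the previous scanline, so the common case is close to 
linear, while the worst case is still bounded by the fixed number 
of edges.

    // Re-sort the active edges on each scanline. When an
    // intersection moves only a few pixels, each element is
    // already near its final position and the inner loop is short.
    struct Edge { float x; }  // current intersection with the scanline

    void insertionSort(Edge[] edges)
    {
        foreach (i; 1 .. edges.length)
        {
            auto key = edges[i];
            size_t j = i;
            while (j > 0 && edges[j - 1].x > key.x)
            {
                edges[j] = edges[j - 1];
                --j;
            }
            edges[j] = key;
        }
    }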

If I accept arbitrary input, then I can no longer meet real-time 
requirements: I cannot get to O(1), and therefore cannot put a 
fixed limit on the maximum time required to complete.

And that is really the only thing O(1) says: you can, at compile 
time, put a fixed limit on the time needed to complete the 
operation.

> actually involved. With large numbers of elements, Big-Oh 
> starts mattering, but if the number of elements isn't large, 
> then coefficients and minor implementation details of an 
> algorithm start mattering a lot more than its conceptual worst 
> case.

Yes, but most collections of entities are not very large by 
nature. Usually, when you try to solve a problem faster, you 
break it down into useful segments and clusters (like age, kind, 
proximity).

Complexity is mostly for thinking about what options you have 
when designing an algorithm. The cookbook/library approach to 
complexity is rather limited, since those solutions were not 
written for the specifics of your problem.

> Still, in general, it's better to be aware of algorithmic 
> complexity and favor algorithms with lower complexity until you 
> need to optimize the code based on your specific use case and 
> determine that a higher complexity algorithm actually performs 
> better in that specific use case.

Be aware of it, yes. Although in real code it is often best to 
just start out with something flexible like a dynamic array, and 
optimize once you know which operations are required and which 
you will never need. That often changes over time.

I find that I am able to solve most of my real-time problems 
with arrays. When I have hard requirements, I often end up using 
very simple data structures: fixed-size circular buffers with a 
carefully calculated headroom (a number of mostly unused 
elements), linked lists of chunks, or something else that is 
really easy to reason about (see the sketch below).
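
As a rough sketch of what I mean by a fixed-size circular buffer 
with headroom (the names and API here are just illustration): the 
capacity is fixed up front, so push and pop are O(1) and the 
worst case is known before the program runs.

    // Fixed-capacity ring buffer; the capacity includes the
    // headroom calculated for the worst expected burst.
    struct RingBuffer(T, size_t capacity)
    {
        private T[capacity] data;
        private size_t head, count;

        // Refuses new elements instead of allocating, so the
        // time bound never changes.
        bool push(T value)
        {
            if (count == capacity) return false;
            data[(head + count) % capacity] = value;
            ++count;
            return true;
        }

        bool pop(ref T value)
        {
            if (count == 0) return false;
            value = data[head];
            head = (head + 1) % capacity;
            --count;
            return true;
        }
    }

E.g. RingBuffer!(float, 44_100 + 1024) would hold one second of 
audio plus some headroom.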

During initialisation of the program it often does not matter. 
If I use linear search and it completes in 100ms, fine; I don't 
care about a 100ms delay at startup.

For batch things done in Python, I just use whatever is easy; on 
current machines most things complete in less than a few seconds 
anyway.

Making libraries easy to use is by far the most important 
parameter, I think. What would be cool is if the compiler could 
adjust the implementation of collections based on profiling!


