Complexity nomenclature
Ola Fosheim Grøstad via Digitalmars-d
digitalmars-d at puremagic.com
Fri Dec 4 08:39:30 PST 2015
On Friday, 4 December 2015 at 15:33:56 UTC, Jonathan M Davis
wrote:
> Only when dealing with an arbitrary number of elements. O(n^2)
> could be fine if you know for a fact that you're always dealing
> with a very small number of elements.
I think there was a misunderstanding about notation here. If we
know that the number of elements is bounded by some small
constant, then every operation on them is O(1) by definition,
whatever that constant is.
> different algorithms perform there. O(1) is definitely better,
> but it's not necessarily required.
Well, O(1) isn't better, it is just mandatory for anything real
time. E.g. if you fix your buffer to one second of audio, that is
44100 samples, so filling it is O(44100) == O(1).
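A rough sketch of what I mean in D (the 440 Hz test tone is made
up for illustration; the point is only that the loop count is a
compile-time constant):

import std.math : sin, PI;

enum sampleRate = 44_100;

// Fill exactly one second of mono audio with a test tone.
// The iteration count is fixed at compile time, so the worst
// case running time has a hard bound.
void fillTone(ref float[sampleRate] buf)
{
    foreach (i, ref s; buf)
        s = cast(float) sin(2.0 * PI * 440.0 * i / sampleRate);
}

void main()
{
    float[sampleRate] buffer;
    fillTone(buffer);
}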
If I am doing a scanline renderer for 2048 pixels, then an
insertion sort is faster than a merge sort, because most line
segments only move a few pixels per scanline. The problem is that
there is maybe a 0.1% chance that most lines are almost
horizontal, at a variety of angles, and then the sort
degenerates. That is limiting, but I can still work out a hard
maximum time required to complete, so it is still O(1) and
tractable as a real time operation.
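Something along these lines (the Edge struct is just an
illustration of the active edge list, not real code from the
renderer):

struct Edge
{
    float x;    // current x intersection with the scanline
}

// Re-sort the active edge list by x after advancing one scanline.
// When most edges move only a few pixels, each element shifts only
// a few positions, so a pass is close to linear in practice.
void insertionSortByX(Edge[] edges)
{
    for (size_t i = 1; i < edges.length; ++i)
    {
        auto key = edges[i];
        size_t j = i;
        while (j > 0 && edges[j - 1].x > key.x)
        {
            edges[j] = edges[j - 1];
            --j;
        }
        edges[j] = key;
    }
}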
If I accept arbitrary input, then I can no longer meet real time
requirements: I cannot get to O(1) and therefore cannot put a
fixed limit on the maximum time required to complete.
And that is really the only thing O(1) says: you can put a fixed
limit, known at compile time, on the time needed to complete the
operation.
> actually involved. With large numbers of elements, Big-Oh
> starts mattering, but if the number of elements isn't large,
> then coefficients and minor implementation details of an
> algorithm start mattering a lot more than its conceptual worst
> case.
Yes, but most collections of entities are not very large by
nature. Usually, when you try to solve a problem faster, you
break it down by useful segmentation and clustering (like age,
kind, proximity).
Complexity is mostly for thinking about what options you have
when designing an algorithm. The cookbook/library approach to
complexity is rather limited, since library algorithms were not
written for the specifics of your problem.
> Still, in general, it's better to be aware of algorithmic
> complexity and favor algorithms with lower complexity until you
> need to optimize the code based on your specific use case and
> determine that a higher complexity algorithm actually performs
> better in that specific use case.
Be aware of it, yes. Although in real code it is often best to
just start out with something flexible like a dynamic array, and
optimize when you know which operations are required and which
you will never need. That often changes over time.
I find that I am able to solve most of my real time problems with
arrays. When I have hard requirements, I often end up using very
simple data structures, like fixed-size circular buffers with a
carefully calculated headroom (a number of mostly unused
elements), or linked lists of chunks, or something else that is
really easy to reason about.
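For example, a fixed-size ring buffer along these lines (names
and sizes are made up, just to show the shape of it):

// Capacity is a compile-time constant, so push and pop both have
// a hard upper bound on their running time.
struct RingBuffer(T, size_t capacity)
{
    private T[capacity] data;
    private size_t head;    // index of the oldest element
    private size_t count;   // number of elements currently stored

    bool push(T value)
    {
        if (count == capacity)
            return false;                        // full: headroom exhausted
        data[(head + count) % capacity] = value;
        ++count;
        return true;
    }

    bool pop(out T value)
    {
        if (count == 0)
            return false;                        // empty
        value = data[head];
        head = (head + 1) % capacity;
        --count;
        return true;
    }
}

void main()
{
    // e.g. 48 slots when I expect at most ~44 live elements:
    // the extra slots are the headroom.
    RingBuffer!(int, 48) buf;
    buf.push(1);
    int value;
    buf.pop(value);
}

The headroom is just the gap between the expected fill level and
the fixed capacity, so overflow handling stays trivial to reason
about.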
During initialisation of the program it often does not matter. If
I use a linear search and it completes in 100ms, fine; I don't
care about a 100ms delay at startup.
For batch things done in Python, I just use whatever is easy; on
current machines most things complete in less than a few seconds
anyway.
Making libraries easy to use is by far the most important
parameter, I think. What would be cool is if the compiler could
adjust the implementation of collections based on profiling!