"in" everywhere

Stanislav Blinov stanislav.blinov at gmail.com
Fri Oct 8 03:06:01 PDT 2010


Jonathan M Davis wrote:
> On Thursday, October 07, 2010 14:44:50 Stanislav Blinov wrote:

>> But that is not a matter of library interface, is it? It's a matter of
>> algorithm/container choice. It's not push_back that was slow in the
>> end, it was std::vector (yes, that's arguable, but the point is that
>> the container defines the rules for its methods, not vice versa).
> 
> Except that when you're dealing with generic code which has to deal with 
> multiple container types (like std.algorithm), you _need_ certain complexity 
> guarantees about an operation since it could happen on any container that it's 
> given. Using operator in in an algorithm could be perfectly reasonable if it had 
> O(1) or O(log n) complexity but be completely unreasonable if it had O(n) 
> complexity. So, the algorithm is reasonable with some containers and not 
> others, whereas if in were restricted to O(1) or O(log n), it could be used 
> by the algorithm without worrying that it would be given a container which 
> would make it an order of complexity greater, or worse.
> 
> If it were strictly a matter of writing code which directly used a particular 
> algorithm and container type, then you could know what the complexities of the 
> algorithm and operations on the container are, but once you're dealing with 
> generic code, operations need to have complexity guarantees regardless of their 
> container, or the complexity of the algorithm will vary with the container type. 
> And if that algorithm is used in yet another algorithm, then it gets that much 
> worse. It can quickly become easy to bury the place where what you're trying to 
> do becomes an order of complexity greater, because the container that you 
> selected was an order of complexity greater than other containers on an 
> operation that an algorithm is using buried in code somewhere.
> 
> - Jonathan M Davis

Yet still, generality ends at some point. You can't devise every 
possible algorithm for every possible type and expect it to have 
set-in-stone complexity independent of those types.

Take std.range.popFrontN(). It's generic, and it's used in other 
algorithms. Yet it has O(1) complexity for ranges that support slicing 
and O(n) for those that don't, and that's before you take into account 
the complexity of the slicing operation itself, or of stepping through 
the range.

Or std.algorithm.find(). It's basically O(n), but when using it one 
should also consider the complexity of the predicate being used.
Just the same thing happened in the case Andrei described: the algorithm 
was O(n) judging from its description (performing n insertions into a 
container), but the container blew it up into quadratic time because of 
its own insertion algorithm.

What I mean is that you'll always have algorithms that perform 
differently for different containers, and you'll always have to choose 
the container that best fits your needs. Generic code is not only about 
efficiency: at the very least you expect it to work for different types.
Replacing vector with list in Andrei's case would probably solve the 
problem at the cost of losing random access (along with contiguous 
storage), which means it would also require changing code wherever 
random access was in use. Having a generic random-access function 
at(Container, index), with complexity that varies with the container 
used (e.g. O(1) for arrays, O(n) for lists), would save you that 
maintenance.

