Compiler hints, inlining and syntax consistency

Sat Dec 28 10:18:55 PST 2013

On Saturday, 28 December 2013 at 17:23:38 UTC, Ola Fosheim 
Grøstad wrote:
> STL is not great, has never been great, and probably will never 
> be great. Cool idea, bad execution for the language it was 
> implemented in. So they had to change C++ to support i better! 
> :-P
>
> How much time does a committee need to agree on a hash-table 
> implementation? Or should I say decades? STL is so 
> designed-by-a-committee that it hurts. It is not a shining 
> star! Not at all.
>
> STL is ok for early prototyping with C++11, but if you care 
> about performance you roll your own.

Nice tirade, but I only referenced STL for its take on 
algorithmic complexity requirements. If it makes the concept more 
palatable, you could instead imagine it as the anti-thesis of 
Java's collection interfaces.

> Single linked lists is very useful for situations where you 
> want lists that are mostly empty or short, but where you do not 
> want to impose an upperbound. O(N) on length() does not matter 
> much in many real-world scenarios.

In these cases you can use `walkLength` which is in itself the 
very warning you want.

> What you want is a (mild) warning if you use a templated 
> algorithm and that algorithm use a O(N) primitive where it is 
> invisible to the library user. A warning you should be able to 
> suppress (basically telling the compiler "trust me, this is 
> deliberate").

You could define a wrapper type to explicitly attach O(n) 
primitives onto ranges and/or containers, but generally the 
response is that you're using the wrong container, so this 
probably won't find its way to Phobos.

However, it should be noted that D supports flexible 
metaprogramming which has resulted in algorithms with looser 
requirements, such as Andrei's 
`std.algorithm.levenshteinDistance` which doesn't require random 
access, circumventing the problem entirely.

> Why is that?  If your non-associative array on average is of 
> length 3-4 then it might outperform an associative array.

In terms of advantages, it is only a minor syntactical benefit 
over using std.algorithm.among/canFind/find.

But the disadvantages are serious, most notably generic code that 
uses `in` and assumes it has some complexity guarantee would 
experience a blow-up in complexity when an array is passed.

Further, adding syntax sugar to such an operation sends the wrong 
signal and is thus asking for trouble during code maintenance - 
initially the array might have been small and use of `in` 
judicious, but later down the line the array grows while the use 
of `in` is preserved because it *looks* low-cost (after all, it 
looks identical to AA lookup). Alternatively, it can be a 
teaching problem; by having the scalable AA lookup and the 
non-scalable array lookup use the same syntax, a beginner could 
produce really bad code as a result of the hidden but vital 
difference.

(Also, for many typical element types, a length of 3-4 is an 
extremely conservative measure. The number where an array starts 
to be slower can be much, much higher than that, but the idea is 
that scalability is initially more important than 
micro-optimization [unless it's a C program, in which case using 
anything but an array is often not even worth the effort])