"in" everywhere

Thu Oct 7 12:40:15 PDT 2010

Andrei Alexandrescu:

> I'm a bit leary of adopting this feature (it has been discussed). To me 
> "in" implies a fast operation and substring searching isn't quite it.
> 
> One thing that could be done is to allow "in" with literal arrays to 
> their right:
> 
> if (x in ["abcde", "asd"]) { ... }
> 
> The size of the operand is constant, known, and visible.

The "in" is useful to tell if an item is present in a collection (dynamic arrays, fixed-sized arrays, sorted arrays, tuple, typetuple, hash set, tree set, hash map, tree map, trees, graph, deque, skip list, finger tree, etc), and it's useful to tell if a substring is present in a string. Those two purposes are very commonly useful in programs.

Despite those purposes are very common, there is no need for an "in" syntax. In dlibs1 I have created an isIn() function that was good enough.

D supports the "in" operator for associative arrays, and in its operator overloading (my Set()/set() implementation supports the "in" operator), so it's natural for people to desire that operator to be usable for the other kind of collections too, like arrays. Maybe those people have used Python, where the in operator (__contains__) is widey used.

Regarding generic programming, Python doesn't have templates, but it has dynamic typing, that also has purposes similar to generic programming. To a python function that uses the "in" operator you may give both a list and an associative array, or other objects that support the __contains__. In this case the Python code performs an O(n) or O(1) operation according to the dynamic type given to the function.

I can live without the "in" operation for arrays and strings (useful to look for substrings too). So one solution is not not change the D language, and not add support for "in" to arrays.

Another solution is just to accept O(n) as the worst complexity for the "in" operator. I don't understand what's the problem in this.

Another solution is to support the "in" operator for dynamic arrays too, and define a new attribute, like @complexity(), plus an Enum that allows to specify the worst case complexity. So associative arrays are annotated with @complexity(O.linear), while the function that searches for items/substrings in arrays/strings is @complexity(O.constant). At compile-time the generic code is then able to query the computational complexity of the "in" operator implementation. The problem is that the compiler today can't enforce functions annotated with @complexity(O.constant) to actually not perform a linear search (but I don't think it's a problem, today if the Range protocol asks an operation to not be worse than O(n ln n) the compiler doesn't enforce it).

I don't like the idea of allowing the "in" operator for sorted arrays, tuples, fixed-sized arrays, skip lists, maps, sets, trees and disallow it for dynamic arrays. It looks a bit silly. People for the next ten years will then ask "in" to be extended for unsorted dynamic arrays & substring search too, you can bet on it.

Bye,
bearophile