std.algorithm.remove and principle of least astonishment

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Sat Nov 20 15:58:33 PST 2010


On 11/20/10 12:32 PM, Rainer Deyke wrote:
> On 11/20/2010 05:12, spir wrote:
>> On Fri, 19 Nov 2010 22:04:51 -0700 Rainer Deyke<rainerd at eldwood.com>
>> wrote:
>>> You don't see the advantage of generic types behaving in a generic
>>> manner?  Do you know how much pain std::vector<bool>  caused in
>>> C++?
>>>
>>> I asked this before, but I received no answer.  Let me ask it
>>> again. Imagine a container Vector!T that uses T[] internally.  Then
>>> consider Vector!char.  What would be its correct element type?
>>> What would be its correct behavior during iteration?  What would be
>>> its correct response when asked to return its length?  Assuming you
>>> come up with a coherent set of semantics for Vector!char, how would
>>> you implement it?  Do you see how easy it would be to implement it
>>> incorrectly?
>>
>> Hello Rainer,
>>
>> The original proposal by Bruno would simplify some project I have in
>> mind (namely, a higher-level universal text type already evoked). The
>> issues you point to intuitively seem relevant to me, but I cannot
>> really understand any. Would you be kind enough to expand a bit on
>> each question? (Thinking of people who know nothing about C++ -- yes,
>> they exist ;-)
>
> std::vector<bool>  in C++ is a specialization of std::vector that packs
> eight booleans into a byte instead of storing each element separately.
> It doesn't behave exactly like other std::vectors and technically
> doesn't meet the C++ requirements of a container, although it tries to
> come as close as possible.  This means that any code that uses
> std::vector<bool>  needs to be extra careful to take those differences into
> account.  This is especially an issue when dealing with generic code
> that uses std::vector<T>, where T may or may not be bool.
>
> The issue with Vector!char is similar.  Because char[] is not a true
> array, generic code that uses T[] can unexpectedly fail when T is char.
> Other containers of char behave like normal containers, iterating over
> individual chars.  char[] iterates over dchars.  Vector!char can,
> depending on its implementation, iterate over chars, iterate over
> dchars, or fail to compile at all when instantiated with T=char.  It's
> not even clear which of these is the correct behavior.

The parallel does not stand up to scrutiny. The problem with
vector<bool> in C++ is that it implements no formal abstraction,
although it is a specialization of a template that does.

D strings exhibit no such problems. They expose their implementation,
an array of code units, and having that available is often handy. They
also obey a formal interface: that of bidirectional ranges.
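
To make the two views concrete, here is a minimal sketch, assuming the
string range primitives as Phobos 2 documents them (front, back and
walkLength decode to dchar). The sample string and its contents are
only illustrative.

```d
import std.range;   // front, back, walkLength, ElementType, range traits

void main()
{
    // "h\u00E9llo": the second character needs two UTF-8 code units.
    string s = "h\u00E9llo";

    // Implementation view: an array of immutable(char) code units.
    static assert(is(typeof(s[0]) == immutable(char)));
    assert(s.length == 6);          // length counts code units

    // Range view: a bidirectional (not random-access) range of dchar.
    static assert(is(ElementType!string == dchar));
    static assert(isBidirectionalRange!string);
    static assert(!isRandomAccessRange!string);
    assert(s.front == 'h');
    assert(s.back == 'o');
    assert(s.walkLength == 5);      // five code points
}
```

Indexing and .length speak in code units, while the range primitives
speak in code points; generic code picks one view by choosing which
interface it is written against.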

> Vector!char is just an example. Any generic code that uses T[] can
> unexpectedly fail to compile or behave incorrectly when used with T=char.
> If I were to use D2 in its present state, I would try to avoid both
> char/wchar and arrays as much as possible in order to avoid this
> trap. This would mean avoiding large parts of Phobos, and providing
> safe wrappers around the rest.

It may in fact be wise to start using D2 and offer criticism grounded
in experience, which could help us improve the state of affairs. The
above is only a fallacious presupposition. Algorithms in Phobos are
written against the formal range interface, so you won't be exposed to
such risks when using them with strings.
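
As a rough illustration of that claim, the sketch below runs a few
std.algorithm and std.range functions over a string containing a
multi-byte character. It assumes the decoding behaviour of the Phobos 2
string range primitives; the sample word is an arbitrary choice.

```d
import std.algorithm : count, equal, find;
import std.range : retro;

void main()
{
    // "na\u00EFve": the third character is two UTF-8 code units.
    string s = "na\u00EFve";
    dchar i = '\u00EF';

    // count walks the range interface, so it sees one code point,
    // not two separate bytes.
    assert(s.count(i) == 1);
    assert(s.count!(c => c > 0x7F) == 1);

    // retro reverses by code point; the multi-byte sequence stays intact.
    assert(equal(s.retro, "ev\u00EFan"));

    // find returns the rest of the string, starting at the match.
    assert(s.find(i) == "\u00EFve");
}
```

None of these calls mentions char or UTF-8; they rely only on the range
primitives, which is why the same code also works for wstring and
dstring.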


Andrei

