std.algorithm.remove and principle of least astonishment

Sun Nov 21 16:31:03 PST 2010

On 11/21/10 6:12 PM, Rainer Deyke wrote:
> On 11/21/2010 11:23, Andrei Alexandrescu wrote:
>> On 11/20/10 9:42 PM, Rainer Deyke wrote:
>>> On 11/20/2010 16:58, Andrei Alexandrescu wrote:
>>>> The parallel does not stand scrutiny. The problem with
>>>> vector<bool> in C++ is that it implements no formal
>>>> abstraction, although it is a specialization of one.
>>>
>>> The problem with std::vector<bool> is that it pretends to be a
>>> std::vector, but isn't.  If it was called dynamic_bitset instead,
>>> nobody would have complained.  char[] has exactly the same
>>> problem.
>>
>> char[] does not exhibit the same issues that vector<bool> has.
>> The situation is very different, and again, trying to reduce one to
>> another misses a lot of the picture.
>
> I agree that there are differences.  For one thing, if you iterate
> over a std::vector<bool> you get actual booleans, albeit through an
> extra layer of indirection.  If you iterate over char[] you might get
> chars or you might get dchars depending on the method you use for
> iterating.

This is sensible because a string may be seen as a sequence of code
points or a sequence of code units. Either view is useful.
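
To make the two views concrete, here is a minimal sketch. The helper
names codeUnitCount and codePointCount are made up for illustration and
are not part of Phobos; the sketch relies only on the built-in .length
property and on foreach decoding when the loop variable is a dchar.

```d
import std.stdio : writeln;

// Code-unit view: .length counts UTF-8 code units.
// (Hypothetical helper, for illustration only.)
size_t codeUnitCount(const(char)[] s)
{
    return s.length;
}

// Code-point view: foreach with a dchar loop variable decodes UTF-8
// and visits one code point per iteration.
// (Hypothetical helper, for illustration only.)
size_t codePointCount(const(char)[] s)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    // U+00E9 occupies two UTF-8 code units; the other letters one each.
    string s = "h\u00E9llo";
    writeln(codeUnitCount(s));  // prints 6
    writeln(codePointCount(s)); // prints 5
}
```

Both numbers are meaningful: the first is what you need for storage
and slicing, the second is what you need when reasoning about
characters.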

> char[] isn't the equivalent of std::vector<bool>.  It's worse.
> char[] is the equivalent of a vector<bool> that keeps the current
> behavior of std::vector<bool> when iterating through iterators, but
> gives access to bytes of packed booleans when using operator[].

I explained why char[] is better than vector<bool>. Ignoring the
explanation and restating a fallacious conclusion based on an
overstretched parallel does little to move the discussion forward.

Again: code units _are_ well-defined, useful to have access to, and good
for a variety of uses. Please understand this.

>> vector<bool> hides representation and in doing so becomes
>> non-compliant with vector<T> which does expose representation.
>> Worse, vector<bool> is not compliant with any concept, express or
>> implied, which makes vector<bool> virtually unusable with generic
>> code.
>
> The ways in which std::vector<bool> differs from any other vector
> are well understood.  It uses proxies instead of true references.
> Its iterators meet the requirements of input/output iterators (or in
> boost terms, readable, writable iterators with random access
> traversal).  Any generic code written with these limitations in mind
> can use std::vector<T> freely.  (The C++ standard library doesn't
> play nicely with std::vector<bool>, but that's another issue
> entirely.)
>
> std::vector<bool> is a useful type, it just isn't a std::vector.
> In that respect, its situation is analogous to that of char[].
>
>>>> It may be wise in fact to start using D2 and make criticism
>>>> grounded in reality that could help us improve the state of
>>>> affairs.
>>>
>>> Sorry, but no.  It would take a huge investment of time and
>>> effort on my part to switch from C++ to D.  I'm not going to make
>>> that leap without looking first, and I'm not going to make it
>>> when I can see that I'm about to jump into a spike pit.
>>
>> You may rest assured that if anything, strings are not a problem.
>
> I'm not concerned about strings, I'm concerned about *arrays*.
> Arrays of T, where T may or may not be a character type.  I see that you
> ignored my Vector!char example yet again.

I did reply to it, but my reply has probably gone unread. Please allow
me to paste it again:

> When you define your abstractions, you are free to decide how you
> want to go about them. The D programming language makes it
> unequivocally clear that char[] is an array of UTF-8 code units that
> offers a bidirectional range of code points. The same goes for wchar[]
> (replace UTF-8 with UTF-16). dchar[] is an array of UTF-32 code
> points which are equivalent to code units, and as such is a full
> random-access range.

So it's up to you what Vector!char does. In D, char[] is an array of
code units that can be iterated as a bidirectional range of code
points. I don't see anything cagey about that.
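
The contract can be checked at compile time. The following is a
sketch, assuming the range traits in Phobos's std.range
(isBidirectionalRange, isRandomAccessRange, ElementType) and its
decoding range primitives for narrow strings; the sample string is
arbitrary.

```d
import std.range;

// As a range, char[] yields code points (dchar) and is bidirectional,
// but it is not random-access.
static assert(isBidirectionalRange!(char[]));
static assert(!isRandomAccessRange!(char[]));
static assert(is(ElementType!(char[]) == dchar));

// dchar[] has one code unit per code point, so it is random-access.
static assert(isRandomAccessRange!(dchar[]));

void main()
{
    char[] s = "h\u00E9".dup;  // 'h' followed by U+00E9

    // Array view: length and indexing operate on UTF-8 code units.
    assert(s.length == 3);
    assert(s[1] == 0xC3);      // first code unit of U+00E9

    // Range view: front and back decode whole code points.
    assert(s.front == 'h');
    assert(s.back == '\u00E9');
}
```

A user-defined Vector!char is under no obligation to copy either half
of this contract; that choice belongs to whoever defines the type.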

> Your assurances aren't increasing my confidence in D, they're
> decreasing my confidence in your judgment (and by extension my
> confidence in D).

I prefaced my assurances with logical arguments that I can only assume
went unread. You are of course entitled to your opinion (though it
would be great if it were better grounded in real reasons); the rest of
us will continue enjoying D2 strings.

Andrei