std.algorithm.remove and principle of least astonishment

Don nospam at nospam.com
Wed Nov 24 04:39:19 PST 2010


Andrei Alexandrescu wrote:
> On 11/22/10 12:01 PM, Steven Schveighoffer wrote:
>> On Mon, 22 Nov 2010 12:40:16 -0500, Andrei Alexandrescu
>> <SeeWebsiteForEmail at erdani.org> wrote:
>>
>>> On 11/22/10 11:22 AM, Steven Schveighoffer wrote:
>>
>>>> You're dodging the question. You claim that if I want to use it as an
>>>> array, I use it as an array, if I want to use it as a range, use it as a
>>>> range. I'm simply pointing out why you can't use it as an array --
>>>> because phobos treats it as a bidirectional range, and you can't force
>>>> it to do what you want.
>>>
>>> Of course you can. After you were to admit that it makes next to no
>>> sense to sort an array of code units, I would have said "well if
>>> somehow you do imagine such a situation, you achieve that by saying
>>> what you mean: cast the char[] to ubyte[] and sort that".
>>
>> That wasn't what you said -- you said I can use char[] as an array if I
>> want to use it as an array, not that I can use ubyte[] as an array
>> (nobody disputes that).
> 
> That still stays valid. The thing is, sort doesn't sort arrays, it sorts 
> random-access ranges.
> 
>>>> The thing is, *only* when one wants to create strings, does one want to
>>>> view the data type as a bidirectional string. When one wants to deal
>>>> with chars as an element of a container, I don't want to be restricted
>>>> to utf requirements.
>>>
>>> If you don't want to be restricted to utf requirements, use ubyte and
>>> ushort. You're saying "I want to use UTF code points without any
>>> associated UTF meaning".
>> And
>> easy to understand means easier to avoid mistakes. The point is, the
>> domain of valid elements in my application is defined by me, not by the
>> library. The library is making assumptions that my poker hands may
>> contain utf8 characters, while I know in my case they cannot.
> 
> Then what's wrong with ubyte? Why do you encode as UTF something that 
> you know isn't UTF? 

> Would you put an integral in a real even though you 
> know it's only integral?
I don't think that's a valid comparison, since we have integer types, 
but we don't have ASCII types.

Here's the issue as I see it: there are very common use cases (and lots 
of existing C code) for a type which stores an ASCII character.

I think we're seeing the exact same issue that causes people to
mistakenly use 'uint' when they mean 'positive integer'.
It LOOKS as though a char is a subset of dchar (i.e., a dchar in the range
0..0x7F).
It LOOKS as though a uint is a subset of int (i.e., an int in the range
0..int.max).

But in both cases, the possibility that the high bit could be set
changes the semantics.
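
A minimal sketch of both cases, added for illustration and not taken from
the thread. The helper names (reinterpretAsInt, widen, decodeFirst) are
hypothetical, and the sketch assumes a current D compiler with Phobos's
std.utf.decode, which may differ from the 2010 library under discussion.
Both implicit conversions compile without a cast, and both give a
different meaning once the high bit is set.

```d
import std.utf : decode;

// Hypothetical helper: uint converts to int implicitly, bit for bit.
int reinterpretAsInt(uint u)
{
    return u;
}

// Hypothetical helper: char converts to dchar implicitly, unit for unit.
dchar widen(char c)
{
    return c;
}

// Hypothetical helper: decode the first code point of a UTF-8 string.
dchar decodeFirst(string s)
{
    size_t index = 0;
    return decode(s, index);
}

void main()
{
    // Low values: uint really does behave like a subset of int.
    assert(reinterpretAsInt(42) == 42);

    // High bit set: the same bits are now a negative int.
    assert(reinterpretAsInt(0x8000_0000) == int.min);

    // ASCII: a char really does behave like a small dchar.
    string ascii = "A";
    assert(widen(ascii[0]) == 'A');

    // U+00E9 is stored as the two UTF-8 code units 0xC3 0xA9.
    string accented = "\u00E9";
    assert(accented.length == 2);

    // Widening the first code unit yields U+00C3, not U+00E9.
    assert(widen(accented[0]) == 0xC3);
    assert(widen(accented[0]) != '\u00E9');

    // Decoding the code units as UTF-8 recovers the intended code point.
    assert(decodeFirst(accented) == '\u00E9');
}
```

In the 0..0x7F range the char and the dchar agree, which is why the
subset view looks right until a value with the high bit set turns up.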

