std.algorithm.remove and principle of least astonishment

Steven Schveighoffer schveiguy at yahoo.com
Mon Nov 22 10:01:52 PST 2010


On Mon, 22 Nov 2010 12:40:16 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail at erdani.org> wrote:

> On 11/22/10 11:22 AM, Steven Schveighoffer wrote:

>> You're dodging the question. You claim that if I want to use it as an
>> array, I use it as an array, if I want to use it as a range, use it as a
>> range. I'm simply pointing out why you can't use it as an array --
>> because phobos treats it as a bidirectional range, and you can't force
>> it to do what you want.
>
> Of course you can. After you were to admit that it makes next to no  
> sense to sort an array of code units, I would have said "well if somehow  
> you do imagine such a situation, you achieve that by saying what you  
> mean: cast the char[] to ubyte[] and sort that".

That wasn't what you said -- you said I can use char[] as an array if I  
want to use it as an array, not that I can use ubyte[] as an array (nobody  
disputes that).

>> The thing is, *only* when one wants to create strings, does one want to
>> view the data type as a bidirectional range. When one wants to deal
>> with chars as an element of a container, I don't want to be restricted
>> to utf requirements.
>
> If you don't want to be restricted to utf requirements, use ubyte and  
> ushort. You're saying "I want to use UTF code points without any  
> associated UTF meaning".

A literal defining an array of ubytes or ushorts is considerably more  
painful than one of chars.

>> FWIW, I deal in ASCII pretty much exclusively, so sorting an array of
>> char is not out of the question.
>
> Example?

In some poker-hand detection code I've written in C++ (and actually in D  
too) in the past, I can use characters to represent each card.  A  
straightforward way to do this is to add each 'card' to a string, then  
sort the string.  This allows me to use string functions and regex to  
detect hand types.
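
A minimal sketch of that approach, in C++. This is not the original poker
code: the one-char-per-rank encoding ("23456789TJQKA"), the function name
classify, and the particular patterns are assumptions made for
illustration, and suits, straights and flushes are left out. Sorting the
chars groups equal ranks together, so regex backreferences can pick out
the repeats.

```cpp
#include <algorithm>
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: classify a five-card hand given as five rank
// characters, e.g. "KQKQK". Only rank-multiplicity hands are detected.
std::string classify(std::string hand) {
    assert(hand.size() == 5);

    // Treat the string as a plain array of ASCII chars: after sorting,
    // cards of equal rank sit next to each other.
    std::sort(hand.begin(), hand.end());

    // Patterns rely on ECMAScript backreferences (\1 repeats group 1).
    static const std::regex four(R"((.)\1\1\1)");
    static const std::regex full_a(R"(^(.)\1\1(.)\2$)");  // aaabb
    static const std::regex full_b(R"(^(.)\1(.)\2\2$)");  // aabbb
    static const std::regex three(R"((.)\1\1)");
    static const std::regex two_pair(R"((.)\1.*(.)\2)");
    static const std::regex pair(R"((.)\1)");

    // Check the strongest hands first, since weaker patterns also match them.
    if (std::regex_search(hand, four)) return "four of a kind";
    if (std::regex_search(hand, full_a) || std::regex_search(hand, full_b))
        return "full house";
    if (std::regex_search(hand, three)) return "three of a kind";
    if (std::regex_search(hand, two_pair)) return "two pair";
    if (std::regex_search(hand, pair)) return "pair";
    return "high card";
}
```

For example, classify("KQKQK") sorts the hand to "KKKQQ" and reports a
full house. The sort is only meaningful because every element is known to
be a single ASCII char.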

You can do the same with ubytes, but it's not as easy to understand.  And  
easy to understand means easier to avoid mistakes.  The point is, the  
domain of valid elements in my application is defined by me, not by the  
library.  The library is making assumptions that my poker hands may  
contain utf8 characters, while I know in my case they cannot.  If I could  
convey this in a way that allows me to keep the nice properties of char  
arrays (i.e. printing as strings), then I would be fine with the library  
assuming UTF-8 unless I told it otherwise.

But there is currently no way to do that: the library steadfastly refuses  
to look at it any other way than as a utf-8 code sequence.  It doesn't help  
matters that the compiler steadfastly looks at them as arrays.

What I want is for the compiler *and* the library to look at strings as  
something other than arrays, and for both to look at char[] as an array,  
so that I can clearly define my intent of how I want them to treat such  
variables.

>> I'm going to drop out of this discussion in order to develop a viable
>> alternative to using arrays to represent strings. Then we can discuss
>> the merits/drawbacks of such a type. I think it will be simple to build.

Here I am continuing to argue.  I swear I'll stop after this :)  At least  
until I have my string type ready.

-Steve

