VLERange: a range in between BidirectionalRange and RandomAccessRange

Thu Jan 13 08:52:26 PST 2011

On Tue, 11 Jan 2011 18:00:30 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail at erdani.org> wrote:

> On 1/11/11 11:21 AM, Steven Schveighoffer wrote:

>> It is supposed to be simple, and provide the expected interface, without
>> causing any undue performance degradation. That is, I should be able to
>> do all the things with a replacement string type that I can with a char
>> array today, as efficiently as I can today, except I should have to work
>> to get at the code-units. The huge benefit is that I can say "I'm
>> dealing with this as an array" when I know it's safe
>
> Unfinished sentence?

Sorry, I forgot '.' :)

> Anyway, for my money you just described what we have now.

All except the 'expected interface' part.  The string type should deal  
with dchars exclusively, since that's what it is a range of.  char[] gives  
you chars (code units) back when you index it, even though Phobos treats  
it as a range of dchar.  Anyone who doesn't use ASCII will be confused by  
this.
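
To make the mismatch concrete, here is a minimal sketch.  It assumes a present-day Phobos in which narrow strings are still auto-decoded by the range primitives; the sample text is an arbitrary choice.

```d
import std.range.primitives : front;

void main()
{
    // "é" is the single code point U+00E9, stored as two UTF-8 code units.
    string s = "\u00E9";

    // Array view: length and indexing work on code units.
    assert(s.length == 2);
    static assert(is(typeof(s[0]) == immutable(char)));

    // Range view: front decodes a whole code point.
    static assert(is(typeof(s.front) == dchar));
    assert(s.front == '\u00E9');
}
```

So s[0] and s.front disagree on both type and value for any non-ASCII text, which is the confusion described above.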

Also, I expect to be able to use a char[] as an array, which Phobos  
doesn't let me do in some cases (e.g. sorting an ASCII character array).
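
A small sketch of the sorting case, again assuming a present-day Phobos where char[] is not considered a random-access range.  The workaround shown uses representation() from std.string (mentioned further down in this message); a cast to ubyte[] would serve the same purpose.

```d
import std.algorithm.sorting : sort;
import std.string : representation;

void main()
{
    char[] letters = "dcba".dup;

    // Phobos sees char[] as a range of dchar, which cannot be
    // random-access, so sorting it directly is rejected at compile time.
    static assert(!__traits(compiles, sort(letters)));

    // Workaround: sort the underlying code units through a ubyte[] view.
    sort(letters.representation);
    assert(letters == "abcd");
}
```

The workaround is only safe because the data is known to be ASCII; sorting the code units of multi-byte text would produce invalid UTF-8.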

>
>> The disagreement will never be fully solved, as there is just as much
>> disagreement about the current state of affairs ;) e.g. should foreach
>> default to using dchar?
>
> I disagree about the disagreement being unsolvable. I'm not rigid; if I  
> saw a terrific abstraction in your string, I'd be all for it. It just  
> shuffles some issues about, and although I agree it does one thing or  
> two better than char[], at the end of the day it doesn't carry its  
> weight.

I see it as having two vast improvements:

1. If we replace char[] with a specific type for string, then char[] can  
be considered a true array by Phobos, and Phobos can then deal with a  
char[] without the need to cast.
2. It protects the casual user from incorrectly using a string by making  
the correct API the default.

Those to me are very important.
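
As a rough illustration of point 2 only, here is a hypothetical wrapper.  It is not the type proposed in this thread; the name String and the codeUnits property are assumptions made for the sketch.  Iteration yields dchar by default, and code units are reachable only by asking for them explicitly.

```d
import std.range.primitives : ElementType, isInputRange;
import std.utf : decode, stride;

// Hypothetical string type: a range of dchar with no indexing.
struct String
{
    private immutable(char)[] data;

    this(immutable(char)[] s) { data = s; }

    @property bool empty() { return data.length == 0; }

    // Decode the first code point without consuming it.
    @property dchar front()
    {
        size_t i = 0;
        return decode(data, i);
    }

    // Skip all code units of the first code point.
    void popFront() { data = data[stride(data, 0) .. $]; }

    // Explicit opt-in for code-unit access.
    @property immutable(char)[] codeUnits() { return data; }
}

static assert(isInputRange!String);
static assert(is(ElementType!String == dchar));

void main()
{
    auto s = String("h\u00E9");

    // No opIndex, so the code-unit mistake does not compile.
    static assert(!__traits(compiles, s[0]));

    assert(s.front == 'h');
    s.popFront();
    assert(s.front == '\u00E9');
    assert(s.codeUnits.length == 2);
}
```

With a type like this in place, a plain char[] could be left to behave as an ordinary array, which is point 1.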

>
>> I don't think I'll ever be 'happy' with the way strings sit in phobos
>> currently. I typically deal in ASCII (i.e. code units), and phobos works
>> very hard to prevent that.
>
> I wonder if we could and should extend some of the functions in  
> std.string to work with ubyte[]. I did add a function called  
> representation() that I didn't document yet. Essentially representation  
> gives you the ubyte[], ushort[], or uint[] underneath a string, with the  
> same qualifiers. Whenever you want an algorithm to work on ASCII in  
> earnest, you can pass representation(s) to it instead of s.

This, again, fails on point 2 above.  A char[] is an array, and allows  
access to code units, which is not the correct interface for a string.   
Supporting ubyte[] doesn't fix that problem.  Making the correct behavior  
the default is usually a theme in D...

> If you work a lot with ASCII, an AsciiString abstraction may be a better  
> and more likely to be successful string type. Better yet, you could  
> simply focus on AsciiChar and then define ASCII strings as arrays of  
> AsciiChar.

This seems like the wrong approach.  Adding a new type does not fix the  
problems with the original type.  We need to replace the original type or  
at least how it is treated by the compiler.

-Steve