VLERange: a range in between BidirectionalRange and RandomAccessRange
Steven Schveighoffer
schveiguy at yahoo.com
Thu Jan 13 08:52:26 PST 2011
On Tue, 11 Jan 2011 18:00:30 -0500, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org> wrote:
> On 1/11/11 11:21 AM, Steven Schveighoffer wrote:
>> It is supposed to be simple, and provide the expected interface, without
>> causing any undue performance degradation. That is, I should be able to
>> do all the things with a replacement string type that I can with a char
>> array today, as efficiently as I can today, except I should have to work
>> to get at the code-units. The huge benefit is that I can say "I'm
>> dealing with this as an array" when I know it's safe
>
> Unfinished sentence?
Sorry, I forgot '.' :)
> Anyway, for my money you just described what we have now.
All except the 'expected interface' part. The string type should deal
with dchars exclusively, since that's what it is a range of. char[] gives
you chars back when you index it. Anyone who doesn't use ASCII will be
confused by this.
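To illustrate the mismatch, here is a small sketch (not taken from Phobos
itself; the sample string is made up): indexing and a plain foreach see
UTF-8 code units, while the range primitive front and foreach with an
explicit dchar see decoded characters.

```d
import std.range : front;

void main()
{
    // "\u00E9t\u00E9" is three characters but five UTF-8 code units.
    string s = "\u00E9t\u00E9";

    assert(s.length == 5);        // length counts code units
    assert(s[0] == 0xC3);         // indexing returns the first code unit
    assert(s.front == '\u00E9');  // the range interface decodes a dchar

    size_t units, points;
    foreach (c; s) ++units;        // c is inferred as char (code unit)
    foreach (dchar c; s) ++points; // asking for dchar decodes
    assert(units == 5 && points == 3);
}
```

So the same value behaves as an array of char in one place and as a range
of dchar in another.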
Also, I expect to be able to use a char[] as an array, which Phobos
doesn't let me do in some cases (e.g. sorting an ASCII character array).
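A minimal sketch of the sorting case, assuming the data is known to be
ASCII (the variable name is made up): std.algorithm.sort wants a
random-access range, which Phobos does not consider char[] to be, so the
array has to be viewed as ubyte[] first.

```d
import std.algorithm : sort;

void main()
{
    char[] letters = "phobos".dup;

    // sort(letters);            // rejected: char[] is treated as a range
    //                           // of dchar, not a random-access range
    sort(cast(ubyte[]) letters); // the cast views the same memory as ubyte[]

    assert(letters == "bhoops"); // sorted in place
}
```

The cast is only safe because every element is a single-byte character;
on non-ASCII text it would scramble multi-byte sequences.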
>
>> The disagreement will never be fully solved, as there is just as much
>> disagreement about the current state of affairs ;) e.g. should foreach
>> default to using dchar?
>
> I disagree about the disagreement being unsolvable. I'm not rigid; if I
> saw a terrific abstraction in your string, I'd be all for it. It just
> shuffles some issues about, and although I agree it does one thing or
> two better than char[], at the end of the day it doesn't carry its
> weight.
I see it as having two vast improvements:
1. If we replace char[] with a specific type for string, then char[] can
be considered a true array by Phobos, and Phobos can then deal with a
char[] without the need to cast.
2. It protects the casual user from incorrectly using a string by making
the default the correct API.
To me, those are very important.
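As a rough illustration of what such a type could look like (this is a
hypothetical sketch, not the implementation proposed in this thread; the
names StringSketch and codeUnits are invented), the default interface is
a range of dchar and code units are only reachable by asking for them:

```d
import std.utf : decode, stride;

// Hypothetical type, for illustration only.
struct StringSketch
{
    private string data; // stored as UTF-8 code units

    this(string s) { data = s; }

    @property bool empty() { return data.length == 0; }

    // Decode the first code point; must not be called when empty.
    @property dchar front()
    {
        size_t i = 0;
        return decode(data, i);
    }

    // Skip as many code units as the first code point occupies.
    void popFront()
    {
        data = data[stride(data, 0) .. $];
    }

    // Explicit escape hatch for code that knows it wants code units.
    @property immutable(ubyte)[] codeUnits()
    {
        return cast(immutable(ubyte)[]) data;
    }
}

void main()
{
    auto s = StringSketch("h\u00E9");
    assert(s.front == 'h');
    s.popFront();
    assert(s.front == '\u00E9');      // one character...
    assert(s.codeUnits.length == 2);  // ...stored as two code units
    s.popFront();
    assert(s.empty);
}
```

With something like this in place, a plain char[] would no longer need
special treatment and could be handled as an ordinary array.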
>
>> I don't think I'll ever be 'happy' with the way strings sit in phobos
>> currently. I typically deal in ASCII (i.e. code units), and phobos works
>> very hard to prevent that.
>
> I wonder if we could and should extend some of the functions in
> std.string to work with ubyte[]. I did add a function called
> representation() that I didn't document yet. Essentially representation
> gives you the ubyte[], ushort[], or uint[] underneath a string, with the
> same qualifiers. Whenever you want an algorithm to work on ASCII in
> earnest, you can pass representation(s) to it instead of s.
This, again, fails on point 2 above. A char[] is an array, and allows
access to code-units, which is not the correct interface for a string.
Supporting ubyte[] doesn't fix that problem. Making the correct behavior
the default is usually a theme in D...
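For reference, a short sketch of the representation() call described in
the quoted text, assuming a Phobos release in which
std.string.representation is public and documented (it was undocumented
at the time of this message):

```d
import std.string : representation;

void main()
{
    string  s = "abc";      // immutable(char)[]
    char[]  m = "abc".dup;  // mutable
    wstring w = "abc"w;     // immutable(wchar)[]

    // Same qualifiers, with the character type swapped for an integer type.
    static assert(is(typeof(s.representation) == immutable(ubyte)[]));
    static assert(is(typeof(m.representation) == ubyte[]));
    static assert(is(typeof(w.representation) == immutable(ushort)[]));

    assert(s.representation[0] == 0x61); // 'a' as a raw code unit
}
```

This is opt-in: the caller has to know to ask for the code units, and
indexing the string itself still yields char.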
> If you work a lot with ASCII, an AsciiString abstraction may be a better
> and more likely to be successful string type. Better yet, you could
> simply focus on AsciiChar and then define ASCII strings as arrays of
> AsciiChar.
This seems like the wrong approach. Adding a new type does not fix the
problems with the original type. We need to replace the original type or
at least how it is treated by the compiler.
-Steve