Proposal for fixing dchar ranges

Steven Schveighoffer schveiguy at yahoo.com
Mon Mar 10 06:35:44 PDT 2014


I proposed this inside the long "major performance problem with
std.array.front" thread, and I've also proposed it before, a long time ago.

But it seems to be getting no attention buried in that thread, not even
negative attention :)

An idea to fix all the problems I see with char[] being treated
specially by phobos: introduce an actual string type, with char[] as
backing, that is a dchar range and that actually dictates the rules we want.
Then, make the compiler use this type for literals.

e.g.:

struct string {
    immutable(char)[] representation;
    this(immutable(char)[] data) { representation = data; }
    ... // dchar range primitives
}
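
To make the "dchar range primitives" part concrete, here is a minimal sketch of what such a type could look like as a library struct today. The name String is hypothetical (chosen only to avoid clashing with the existing string alias), and decoding through std.utf.decode and std.utf.stride is an assumption about how the primitives might be written, not a settled design.

```d
import std.range.primitives : isInputRange;
import std.utf : decode, stride;

// Hypothetical name; the proposal would call this type `string`.
struct String
{
    immutable(char)[] representation;

    this(immutable(char)[] data) { representation = data; }

    // Input range primitives: one decoded code point per step.
    @property bool empty() { return representation.length == 0; }

    @property dchar front()
    {
        size_t i = 0;
        return decode(representation, i);
    }

    void popFront()
    {
        // stride gives the number of code units in the leading code point.
        representation = representation[stride(representation, 0) .. $];
    }

    // Forward range primitive.
    @property String save() { return this; }

    // No opIndex on purpose: s[4] is rejected at compile time.
}

static assert(isInputRange!String);
```

Code that really wants code units would go through the representation member explicitly.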

Then, a char[] is simply an array of char.

points:

1. No more issues with foreach(c; "cassé"), it iterates via dchar
2. No more issues with "cassé"[4], it is a static compiler error.
3. No more awkward ASCII manipulation using ubyte[].
4. No more phobos schizophrenia saying char[] is not an array.
5. No more special casing char[] array templates to fool the compiler.
6. Any other special rules we come up with can be dictated by the library,  
and not ignored by the compiler.
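
For reference, points 1, 2 and 4 can be reproduced with the existing string type. This is only a sketch of the current behaviour, assuming a UTF-8 source file with a precomposed 'é' and a phobos that still auto-decodes narrow strings; the function names are made up for illustration.

```d
import std.range.primitives : ElementType;

// The language treats string as an array of UTF-8 code units...
size_t countCodeUnits(string s)
{
    size_t n = 0;
    foreach (c; s)              // c is inferred as char, not dchar
    {
        static assert(c.sizeof == 1);
        ++n;
    }
    return n;
}

// ...unless the loop variable is explicitly typed as dchar.
size_t countCodePoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)        // the compiler decodes here
        ++n;
    return n;
}

// Indexing also works on code units: [4] is the first byte of 'é'.
static assert("cassé"[4] == 0xC3);

// Meanwhile phobos range primitives claim the element type is dchar.
static assert(is(ElementType!string == dchar));
```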

Note, std.algorithm.copy(string1, mutablestring) will still decode/encode,
but it's more explicit: it's EXPLICITLY a dchar range. Using
std.algorithm.copy(string1.representation, mutablestring.representation)
will avoid the issue.
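
Something close to the second form can already be written with the existing std.string.representation function, which reinterprets a char array as ubyte without copying. The sketch below uses it as a stand-in for the proposed member; copyCodeUnits is a hypothetical helper name.

```d
import std.algorithm.mutation : copy;
import std.string : representation;

// Copies code units verbatim, with no decode/encode round trip.
char[] copyCodeUnits(string src)
{
    auto dst = new char[](src.length);
    // Both sides are viewed as ubyte arrays, so copy sees plain bytes.
    copy(src.representation, dst.representation);
    return dst;
}
```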

I imagine only code that is currently UTF ignorant will break, and that
code is easily 'fixed' by appending '.representation'.

-Steve

