Proposal for fixing dchar ranges
John Colvin
john.loughran.colvin at gmail.com
Mon Mar 10 14:46:23 PDT 2014
On Monday, 10 March 2014 at 13:35:33 UTC, Steven Schveighoffer
wrote:
> I proposed this inside the long "major performance problem with
> std.array.front" thread, and I've also proposed it before, a long
> time ago.
>
> But it seems to be getting no attention buried in that thread,
> not even negative attention :)
>
> An idea to fix all the problems I see with char[] being
> treated specially by phobos: introduce an actual string type,
> with char[] as backing, that is a dchar range, that actually
> dictates the rules we want. Then, make the compiler use this
> type for literals.
>
> e.g.:
>
> struct string {
> immutable(char)[] representation;
> this(immutable(char)[] data) { representation = data; }
> ... // dchar range primitives
> }
>
> Then, a char[] array is simply an array of char.
>
> points:
>
> 1. No more issues with foreach(c; "cassé"), it iterates via
> dchar
> 2. No more issues with "cassé"[4], it is a static compiler
> error.
> 3. No more awkward ASCII manipulation using ubyte[].
> 4. No more phobos schizophrenia saying char[] is not an array.
> 5. No more special casing char[] array templates to fool the
> compiler.
> 6. Any other special rules we come up with can be dictated by
> the library, and not ignored by the compiler.
>
> Note, std.algorithm.copy(string1, mutablestring) will still
> decode/encode, but it's more explicit. It's EXPLICITLY a dchar
> range. Using std.algorithm.copy(string1.representation,
> mutablestring.representation) will avoid the issues.
>
> I imagine only code that is currently UTF ignorant will break,
> and that code is easily 'fixed' by adding the 'representation'
> qualifier.
>
> -Steve
Just to check I understand this fully:
In this new scheme, what would this do?
auto s = "cassé".representation;
foreach(i, c; s) write(i, ':', c, ' ');
writeln(s);
Currently - without the .representation - I get
0:c 1:a 2:s 3:s 4:e 5:̠6:`
cassé
or, to spell it out a bit more:
0:c 1:a 2:s 3:s 4:e 5:xCC 6:x81
cassé
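For reference, here is a minimal sketch of how current D treats the same text, so the two views can be compared side by side. It assumes the "é" is written as 'e' followed by U+0301 (combining acute accent), which matches the byte output above. The escape sequence and the numeric casts are my own choices for the illustration, not part of the proposal.

```d
import std.stdio : write, writeln;
import std.string : representation;
import std.range : walkLength;

void main()
{
    // 'e' followed by U+0301, spelled with an escape so the source
    // file's own encoding cannot change the result (an assumption
    // about how the original literal was typed).
    string s = "casse\u0301";

    // A plain foreach over a string visits UTF-8 code units (char).
    foreach (i, c; s)
        write(i, ":", cast(ubyte) c, " ");
    writeln();

    // Asking for dchar makes foreach decode; the index is the code
    // unit offset at which each code point starts.
    foreach (i, dchar c; s)
        write(i, ":", cast(uint) c, " ");
    writeln();

    // .representation exposes the same bytes as immutable(ubyte)[].
    immutable(ubyte)[] bytes = s.representation;
    assert(bytes.length == 7);  // 5 ASCII bytes plus 0xCC 0x81
    assert(s.walkLength == 6);  // range primitives decode to 6 code points
}
```

Printing numeric values rather than characters avoids depending on how the terminal renders a lone combining accent or a partial UTF-8 sequence.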