Making all strings UTF ranges has some risk of WTF
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Thu Feb 4 09:19:42 PST 2010
bearophile wrote:
> Simen kjaeraas:
>> Of the above, I feel (b) is the correct solution, and I understand
>> it has already been implemented in svn.
>
> Yes, I presume he was mostly looking for a justification of his ideas
> he has already accepted and even partially implemented :-)

I am ready to throw away the implementation as soon as a better idea
comes around. As on other occasions, I made the change to see how things
feel with the new approach.

Generally it feels like the new state of affairs is a solid improvement.
One recurring problem has been that some code assumed that
ElementType!SomeString has the width of one encoding unit. That
assumption no longer holds, since the element type of a string is now
dchar, so I had to change such code to use typeof(SomeString.init[0])
instead. I'll probably abstract that as CodeUnit!SomeString in
std.traits.
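
To make the distinction concrete, here is a minimal sketch of what such a CodeUnit template could look like. The definition below is only an illustration written in current D syntax; it assumes a Phobos in which strings are ranges of dchar, and it is not necessarily what ends up in std.traits.

```d
import std.range : ElementType;

// Hypothetical helper: the type of one encoding unit of the string type S,
// obtained exactly as described above, via typeof(S.init[0]).
template CodeUnit(S)
{
    alias CodeUnit = typeof(S.init[0]);
}

// The code unit depends on the encoding of the string type...
static assert(is(CodeUnit!string  == immutable(char)));
static assert(is(CodeUnit!wstring == immutable(wchar)));
static assert(is(CodeUnit!dstring == immutable(dchar)));

// ...while the range element type is dchar for all of them.
static assert(is(ElementType!string  == dchar));
static assert(is(ElementType!wstring == dchar));
static assert(is(ElementType!dstring == dchar));

void main() {}
```

Code that needs the storage width should ask for CodeUnit!S.sizeof; code that iterates characters should stick with ElementType!S.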

I also found some bugs; for example, Levenshtein distance was erroneous
because it didn't operate at the character level. The fix using front and
popFront was very simple.
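
For illustration, here is a rough sketch of an edit distance that walks its inputs with empty, front, and popFront, so it compares decoded characters rather than code units. This is not the Phobos code; the name editDistance is made up for the example, and it assumes the range primitives decode narrow strings to dchar.

```d
import std.algorithm : min, swap;
import std.range : empty, front, popFront, walkLength;

// Hypothetical example function: Levenshtein distance over code points.
size_t editDistance(string s, string t)
{
    // prev[j] holds the distance between the prefix of s consumed so far
    // and the first j characters of t. walkLength counts decoded
    // characters, not bytes.
    auto prev = new size_t[](t.walkLength + 1);
    foreach (j, ref cell; prev) cell = j;
    auto cur = new size_t[](prev.length);

    size_t i = 0;
    for (; !s.empty; s.popFront())
    {
        ++i;
        immutable dchar a = s.front;   // one decoded character of s
        cur[0] = i;
        size_t j = 0;
        for (auto u = t; !u.empty; u.popFront())
        {
            ++j;
            immutable size_t cost = (a == u.front) ? 0 : 1;
            cur[j] = min(prev[j] + 1,         // deletion
                         cur[j - 1] + 1,      // insertion
                         prev[j - 1] + cost); // substitution or match
        }
        swap(prev, cur);
    }
    return prev[$ - 1];
}

void main()
{
    // "\u00e9" takes two UTF-8 code units but is a single character,
    // so replacing it with 'e' is one edit.
    assert(editDistance("caf\u00e9", "cafe") == 1);
    assert(editDistance("kitten", "sitting") == 3);
}
```

A version that indexed the strings directly would report a distance of 2 for the first pair, because it would see two bytes where there is one character.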

Regarding defining an entire new struct for strings, I think that's a
sensible approach. With the new operators in tow, a UString (universal
string) type that traffics in dchar and makes representation a detail
would be nicely implementable. It could even have mutable elements at
dchar granularity. My feeling is, however, that at this point too much
toothpaste is out of the tube for that to happen in D2. That would be
offset if current strings were unbearable, but I think they're working
very well.

Andrei