Making all strings UTF ranges has some risk of WTF
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Wed Feb 3 20:35:31 PST 2010
Walter Bright wrote:
> Andrei Alexandrescu wrote:
>> It's no secret that string et al. are not a magic recipe for writing
>> correct Unicode code.
>
> I'm concerned it would be slow. Most operations on strings do not need
> to decode the unicode characters, for example, find, startsWith, etc.,
> do not. Decoding then doing find, startsWith, etc., will be considerably
> slower.
I thought you were going to say that, but fortunately it's easy to
special-case certain algorithms for strings during compilation. In fact
I already did - for example, Boyer-Moore searching would be very
difficult to rewrite for variable-length characters, but there's no need
for it. I special-cased that algorithm.
I believe this is a good strategy.
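
To illustrate the kind of compile-time special-casing meant here, a minimal sketch follows. It is not the actual Phobos code: the name beginsWith is made up for this example, and the imports assume the current Phobos module layout. When both arguments are strings with the same code unit type, it compares raw code units without decoding; anything else goes through the generic element-by-element path.

```d
import std.range.primitives : empty, front, popFront, isInputRange,
                              ElementEncodingType;
import std.traits : isSomeString, Unqual;

// Hypothetical helper for illustration only, not a Phobos function.
bool beginsWith(R1, R2)(R1 haystack, R2 needle)
if (isInputRange!R1 && isInputRange!R2)
{
    static if (isSomeString!R1 && isSomeString!R2 &&
               is(Unqual!(ElementEncodingType!R1) ==
                  Unqual!(ElementEncodingType!R2)))
    {
        // Same encoding on both sides: a prefix match on code units
        // is a prefix match on characters, so no decoding is needed.
        return haystack.length >= needle.length
            && haystack[0 .. needle.length] == needle;
    }
    else
    {
        // Generic path: compare element by element. For strings of
        // different encodings, front yields decoded code points.
        for (; !needle.empty; haystack.popFront(), needle.popFront())
        {
            if (haystack.empty || haystack.front != needle.front)
                return false;
        }
        return true;
    }
}

void main()
{
    assert(beginsWith("héllo wörld", "héllo"));  // fast path, no decoding
    assert(!beginsWith("héllo", "hello"));
    assert(beginsWith("héllo"w, "hé"));          // wstring vs string: generic path
    assert(beginsWith([1, 2, 3], [1, 2]));       // non-string ranges: generic path
}
```

The branch is chosen by static if during compilation, so the string fast path costs nothing at run time and callers never see the difference.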
Andrei