Making all strings UTF ranges has some risk of WTF

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Wed Feb 3 20:35:31 PST 2010


Walter Bright wrote:
> Andrei Alexandrescu wrote:
>> It's no secret that string et al. are not a magic recipe for writing 
>> correct Unicode code.
> 
> I'm concerned it would be slow. Most operations on strings do not need 
> to decode the Unicode characters; find, startsWith, etc., for example, 
> do not. Decoding and then doing find, startsWith, etc., will be 
> considerably slower.

I thought you were going to say that, but fortunately it's easy to 
special-case certain algorithms for strings during compilation. In fact 
I already did: for example, Boyer-Moore searching would be very 
difficult to rewrite for variable-length characters, but there's no need 
to. I special-cased that algorithm.
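
To make the idea concrete, here is a minimal sketch of compile-time 
special-casing. It is written in C++ rather than D, and it is not the 
actual Phobos code: starts_with, starts_with_generic and decode_utf8 are 
hypothetical names chosen for illustration, and the decoder assumes 
well-formed UTF-8 and does no validation. The dispatcher picks the 
code-unit path for UTF-8 strings when the template is instantiated, so 
no decoding happens on that path.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical decoder: turns well-formed UTF-8 into code points.
// It performs no validation; it exists only to show the "slow" path.
inline std::u32string decode_utf8(std::string_view s) {
    std::u32string out;
    for (std::size_t i = 0; i < s.size();) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        // Sequence length is determined by the lead byte.
        std::size_t len = b < 0x80 ? 1 : b < 0xE0 ? 2 : b < 0xF0 ? 3 : 4;
        char32_t cp = len == 1 ? b : b & (0xFF >> (len + 1));
        for (std::size_t k = 1; k < len && i + k < s.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
        out.push_back(cp);
        i += len;
    }
    return out;
}

// Generic path: element-wise comparison over any two sized ranges.
template <class R1, class R2>
bool starts_with_generic(const R1& hay, const R2& needle) {
    return std::size(needle) <= std::size(hay) &&
           std::equal(std::begin(needle), std::end(needle), std::begin(hay));
}

// Dispatcher: the branch is selected at compile time.
template <class R1, class R2>
bool starts_with(const R1& hay, const R2& needle) {
    if constexpr (std::is_convertible_v<const R1&, std::string_view> &&
                  std::is_convertible_v<const R2&, std::string_view>) {
        // UTF-8 special case: compare code units directly, no decoding.
        std::string_view h = hay, n = needle;
        return h.substr(0, n.size()) == n;
    } else {
        return starts_with_generic(hay, needle);
    }
}
```

For well-formed UTF-8, a needle that matches as a prefix at the code-unit 
level also matches at the code-point level, so both paths should agree: 
starts_with(std::string("h\xC3\xA9llo"), std::string("h\xC3\xA9")) takes 
the code-unit branch, while the same call on the decode_utf8 results 
takes the generic one.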

I believe this is a good strategy.

Andrei


