Ceci n'est pas une char

Walter Bright newshound at digitalmars.com
Thu Apr 6 18:27:18 PDT 2006


Sean Kelly wrote:
> Walter Bright wrote:
>> Thomas Kuehne wrote:
>>> Challenge:
>>> Provide a D implementation that first converts to UTF-32 and has
>>> shorter runtime than the code below:
>>
>> I don't know about that, but the code below isn't optimal <g>. Replace 
>> the sar's with a lookup of the 'stride' of the UTF-8 character (see 
>> std.utf.UTF8stride[]). An implementation is std.utf.toUTFindex().
> 
> I've been wondering about this.  Will 'stride' be accurate for any 
> arbitrary string position or input data?  I would assume so, but don't 
> know enough about how UTF-8 is structured to be sure.

UTF8stride[] will give 0xFF for values that are not at the beginning of 
a valid UTF-8 sequence.
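
A minimal sketch of the stride idea, to make the answer above concrete. It is not the Phobos code: strideOf() and toByteIndex() are hypothetical names, and the branches stand in for what std.utf.UTF8stride[] does with a table lookup. The sketch assumes that lead bytes 0xF8 and above are invalid, as in current UTF-8; the real table may classify them differently. A position in the middle of a sequence holds a continuation byte (10xxxxxx), so the lookup reports 0xFF there instead of a length.

```d
import std.stdio;

// Hypothetical stand-in for a lookup in UTF8stride[]: the length in bytes
// of the UTF-8 sequence that starts with byte b, or 0xFF if b cannot
// start a sequence.
ubyte strideOf(ubyte b)
{
    if (b < 0x80) return 1;     // 0xxxxxxx: ASCII
    if (b < 0xC0) return 0xFF;  // 10xxxxxx: continuation byte, not a start
    if (b < 0xE0) return 2;     // 110xxxxx
    if (b < 0xF0) return 3;     // 1110xxxx
    if (b < 0xF8) return 4;     // 11110xxx
    return 0xFF;                // assumed invalid in this sketch
}

// Hypothetical helper in the spirit of std.utf.toUTFindex(): the byte
// index of the n-th character of s, found by hopping stride by stride.
// Returns size_t.max if the walk runs off the end or hits a bad start byte.
size_t toByteIndex(const(char)[] s, size_t n)
{
    size_t i = 0;
    for (size_t k = 0; k < n; ++k)
    {
        if (i >= s.length) return size_t.max;
        ubyte st = strideOf(cast(ubyte) s[i]);
        if (st == 0xFF) return size_t.max;
        i += st;
    }
    return i > s.length ? size_t.max : i;
}

void main()
{
    string s = "aé€";                       // 1 + 2 + 3 bytes
    writeln(toByteIndex(s, 2));             // prints 3: '€' starts at byte 3
    writeln(strideOf(cast(ubyte) s[2]));    // prints 255: byte 2 is mid-sequence
}
```

Note that the stride only classifies the first byte. It does not check that the following bytes are well-formed continuation bytes, so it is accurate for valid input but is not a validator.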


