toUTFz and WinAPI GetTextExtentPoint32W

Christophe travert at phare.normalesup.org
Tue Sep 20 17:15:54 PDT 2011


Timon Gehr, in message (digitalmars.D.learn:29641), wrote:
>> Last point: walkLength is not optimized for strings.
>> std.utf.count should be.
>>
>> This short implementation of count was 3 to 8 times faster than
>> walkLength in a simple benchmark:
>>
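>> size_t myCount(string text)
>> {
>>    // Start from the byte count, then subtract one per continuation byte.
>>    size_t n = text.length;
>>    for (size_t i=0; i<text.length; ++i)
>>      {
>>        // Top two bits: 0 or 1 = ASCII, 2 = continuation, 3 = lead byte.
>>        auto s = text[i]>>6;
>>        // The expression is 1 only when s == 2, and 0 otherwise.
>>        n -= (s>>1) - ((s+1)>>2);
>>      }
>>    return n;
>> }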
>>
>> (compiled with gdc on 64 bits; the sample text was the introduction of
>> the French Wikipedia UTF-8 article, down to the table of contents -
>> http://fr.wikipedia.org/wiki/UTF-8 ).
>>
>> The reason is that the loop can be unrolled by the compiler.
> 
> Very good point, you might want to file an enhancement request. It would 
> make the functionality different enough to prevent count from being 
> removed: walkLength throws on an invalid UTF sequence.

I would be glad to do so, but I am quite new here, so I don't know how 
to. A little pointer could help.
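
For reference, the counting idea quoted above can also be written with an
explicit test instead of the shift arithmetic. This is only a minimal sketch,
not the code that was benchmarked: the name countCodePoints is hypothetical,
and like myCount it assumes the input is valid UTF-8 and does no validation
(which, as noted above, is where it differs from walkLength).

```d
import std.range : walkLength;
import std.utf : count;

// Hypothetical helper: counts UTF-8 code points by skipping continuation bytes.
// Assumes valid UTF-8; invalid sequences are not detected.
size_t countCodePoints(const(char)[] text)
{
    size_t n = 0;
    foreach (char c; text)   // iterate over code units (bytes), not code points
    {
        // Continuation bytes look like 10xxxxxx; any other byte starts a code point.
        if ((c & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

void main()
{
    string s = "déjà vu";             // 7 code points stored in 9 bytes
    assert(s.length == 9);
    assert(countCodePoints(s) == 7);
    assert(walkLength(s) == 7);       // decodes the string code point by code point
    assert(count(s) == 7);            // std.utf.count
}
```

Whether this form is as fast as the branch-free version would have to be
measured; it is meant to show what is being counted, not to replace it.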

-- 
Christophe

