toUTFz and WinAPI GetTextExtentPoint32W
Timon Gehr
timon.gehr at gmx.ch
Wed Sep 21 05:41:42 PDT 2011
On 09/21/2011 02:15 AM, Christophe wrote:
> Timon Gehr wrote, in message (digitalmars.D.learn:29641):
>>> Last point: walkLength is not optimized for strings.
>>> std.utf.count should be.
>>>
>>> This short implementation of count was 3 to 8 times faster than
>>> walkLength in a simple benchmark:
>>>
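>>> size_t myCount(string text)
>>> {
>>>     size_t n = text.length;
>>>     // size_t index, so it matches text.length on 64-bit targets
>>>     for (size_t i = 0; i < text.length; ++i)
>>>     {
>>>         auto s = text[i] >> 6;           // top two bits of the byte, 0..3
>>>         n -= (s >> 1) - ((s + 1) >> 2);  // 1 only when s == 2 (continuation byte)
>>>     }
>>>     return n;
>>> }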
>>>
>>> (compiled with gdc on 64 bits; the sample text was the introduction of
>>> the French Wikipedia UTF-8 article, down to the table of contents -
>>> http://fr.wikipedia.org/wiki/UTF-8 ).
>>>
>>> The reason is that the loop can be unrolled by the compiler.
>>
>> Very good point, you might want to file an enhancement request. It would
>> make the functionality different enough to prevent count from being
>> removed: walkLength throws on an invalid UTF sequence.
>
> I would be glad to do so, but I am quite new here, so I don't know how
> to. A little pointer could help.
>
http://d.puremagic.com/issues/
You can tick 'Severity: enhancement request'. It would probably be best
if it threw when the final result is larger than text.length, though.
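
For illustration, here is a minimal sketch of a count along the lines
discussed above. The name countCodePoints is hypothetical and this is not
the std.utf.count implementation. Instead of the shift arithmetic in the
quoted myCount, it counts every byte that is not a UTF-8 continuation byte
(10xxxxxx), which should give the same result. It assumes the input is
valid UTF-8 and does no validation, unlike walkLength, which throws on an
invalid sequence.

```d
import std.range : walkLength;

/// Hypothetical helper: counts code points in UTF-8 text by counting
/// the bytes that are NOT continuation bytes (bit pattern 10xxxxxx).
/// No validation is performed; invalid input yields a meaningless count.
size_t countCodePoints(const(char)[] text) pure nothrow @safe
{
    size_t n = 0;
    // Iterating with an explicit char type walks code units, not decoded
    // code points, so there is no decoding and no branch on sequence length.
    foreach (char c; text)
        n += ((c & 0xC0) != 0x80) ? 1 : 0;
    return n;
}

unittest
{
    string s = "\u00E9crit";              // 'é' takes two bytes in UTF-8
    assert(s.length == 6);                // code units
    assert(countCodePoints(s) == 5);      // code points
    assert(countCodePoints(s) == s.walkLength);
}
```

On valid input the result agrees with walkLength; the two only differ in
how they treat invalid sequences.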