toUTFz and WinAPI GetTextExtentPoint32W

Wed Sep 21 05:41:42 PDT 2011

On 09/21/2011 02:15 AM, Christophe wrote:
> Timon Gehr, in message (digitalmars.D.learn:29641), wrote:
>>> Last point: walkLength is not optimized for strings.
>>> std.utf.count should be.
>>>
>>> This short implementation of count was 3 to 8 times faster than
>>> walkLength in a simple benchmark:
>>>
>>> size_t myCount(string text)
>>> {
>>>     size_t n = text.length;
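>>>     for (size_t i=0; i<text.length; ++i)
>>>       {
>>>         // top two bits of the byte: 2 (binary 10) marks a continuation byte
>>>         auto s = text[i]>>6;
>>>         // this expression is 1 only when s == 2, and 0 otherwise
>>>         n -= (s>>1) - ((s+1)>>2);
>>>       }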
>>>     return n;
>>> }
>>>
>>> (compiled with gdc for 64-bit; the sample text was the introduction of
>>> the French Wikipedia UTF-8 article, down to the table of contents -
>>> http://fr.wikipedia.org/wiki/UTF-8 ).
>>>
>>> The reason is that the loop can be unrolled by the compiler.
>>
>> Very good point, you might want to file an enhancement request. It would
>> make the functionality different enough to prevent count from being
>> removed: walkLength throws on an invalid UTF sequence.
>
> I would be glad to do so, but I am quite new here, so I don't know how
> to. A little pointer could help.
>

http://d.puremagic.com/issues/

You can tick 'Severity: enhancement request'. It would probably be best,
though, if it threw when the final result is larger than text.length.
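
For reference, here is a minimal sketch of the counting trick quoted above, written as a standalone program. The name countCodePoints is hypothetical and is not an existing Phobos function. The sketch assumes the input is valid UTF-8 and does no validation: it simply subtracts one for every continuation byte (bit pattern 10xxxxxx).

```d
import std.stdio : writeln;

/// Hypothetical helper (not part of Phobos): counts code points in UTF-8
/// text by discounting continuation bytes. Assumes valid UTF-8 input and
/// performs no validation.
size_t countCodePoints(const(char)[] text)
{
    size_t n = text.length;
    // Iterating with an explicit char element type walks the raw code
    // units; no UTF decoding takes place.
    foreach (char c; text)
    {
        // Top two bits of the byte: 0, 1, 2 or 3. Only 2 (binary 10)
        // marks a continuation byte.
        immutable uint s = c >> 6;
        // Branch-free test: yields 1 when s == 2 and 0 for s == 0, 1, 3.
        n -= (s >> 1) - ((s + 1) >> 2);
    }
    return n;
}

void main()
{
    // "é" takes two bytes in UTF-8, so this string is 8 bytes long.
    writeln(countCodePoints("a écrit"));
}
```

Because there is no branch and no decoding inside the loop, the body is a good candidate for unrolling, which matches the explanation given in the quoted message.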