toUTFz and WinAPI GetTextExtentPoint32W

Tue Sep 20 16:57:34 PDT 2011

"Jonathan M Davis" wrote in message (digitalmars.D.learn:29637):
> On Tuesday, September 20, 2011 14:43 Andrej Mitrovic wrote:
>> On 9/20/11, Jonathan M Davis <jmdavisProg at gmx.com> wrote:
>> > Or std.range.walkLength. I don't know why we really have std.utf.count. It
>> > just calls walkLength anyway. I suspect that it's a function that predates
>> > walkLength and was made to use walkLength after walkLength was
>> > introduced. But it's kind of pointless now.
>> > 
>> > - Jonathan M Davis
>> 
>> I don't think having better-named aliases is a bad thing. Although now
>> I'm seeing it's not just an alias but a function.
> 

std.utf.count has one advantage: someone looking for the function will 
find it. A programmer might not look in std.range for a function 
about UTF strings, and even if they did, the walkLength documentation 
does not say that it works with (narrow) strings the way it does. To know 
you can use walkLength, you must know:
- that popFront works differently on strings (it advances by a whole
  code point),
- that hasLength is not true for strings,
- what walkLength is.
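
The three points above can be seen in a few lines. This is only a minimal sketch; the sample string is my own choice, and it assumes the current Phobos behaviour where narrow strings are iterated by code point:

```d
import std.range : walkLength, hasLength;

void main()
{
    // "écrit": 5 code points, but 6 UTF-8 code units (é takes two bytes).
    string s = "\u00E9crit";

    // Narrow strings deliberately do not count as ranges with a length.
    static assert(!hasLength!string);

    assert(s.length == 6);      // .length counts code units
    assert(s.walkLength == 5);  // walkLength pops code points one by one
}
```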

So yes, experienced programmers don't need std.utf.count, but newbies 
do.

Last point: walkLength is not optimized for strings.
std.utf.count should be.

This short implementation of count was 3 to 8 times faster than 
walkLength in a simple benchmark:

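// Counts code points in a UTF-8 string (assumes the input is valid UTF-8).
size_t myCount(string text)
{
  size_t n = text.length;  // start from the number of code units
  for (size_t i=0; i<text.length; ++i)
    {
      auto s = text[i]>>6;       // top two bits: 2 means continuation byte
      n -= (s>>1) - ((s+1)>>2);  // 1 when s == 2, otherwise 0
    }
  return n;
}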

(compiled with gdc on 64 bits; the sample text was the introduction of 
the French Wikipedia UTF-8 article, down to the table of contents - 
http://fr.wikipedia.org/wiki/UTF-8 ).

The reason is that the loop can be unrolled by the compiler.
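
For readers puzzled by the shifts in myCount: the expression is 1 only for continuation bytes (those of the form 10xxxxxx) and 0 for everything else, so the function is really counting the bytes that start a code point. Here is a more readable sketch of the same idea, checked against walkLength. The name countCodePoints and the sample string are hypothetical, and like myCount it assumes valid UTF-8:

```d
import std.range : walkLength;

// Hypothetical helper, not part of Phobos: counts code points by
// skipping continuation bytes. Assumes the input is valid UTF-8.
size_t countCodePoints(string text)
{
    size_t n = 0;
    foreach (char c; text)       // iterate code units, no decoding
    {
        if ((c & 0xC0) != 0x80)  // not of the form 10xxxxxx
            ++n;
    }
    return n;
}

void main()
{
    // Mix of 1-, 2- and 3-byte sequences.
    string sample = "d\u00E9j\u00E0 vu \u20AC";
    assert(countCodePoints(sample) == sample.walkLength);
    assert(countCodePoints("") == 0);
}
```

I have not benchmarked this variant; it is only meant to show what the arithmetic computes.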