Get Character At?

Wed Apr 25 08:21:22 PDT 2007

Frits van Bommel wrote:
> Derek Parnell wrote:
>> On Wed, 25 Apr 2007 13:41:25 +1000, Daniel Keep wrote:
>>
>>> Incidentally, I don't suppose you know anything about the relative
>>> performance of your method up there ^^ and the one in my article down
>>> here vv:
>>
>> It seems that your routine is about 3 times slower than the one I had
>> shown. Here is my test program ... I modified your routine slightly
>> because
>> the idiom "if (x++ == n)" is a dangerous one as it is unclear if 'x' gets
>> incremented before or after the comparison. I changed it to be more
>> clear.
> 
> How is it unclear? Postfix-increment clearly means that the value before
> incrementation is returned (and thus compared to n in that expression).
> 
>> I also changed my routine to output a dchar rather than a char[] and to
>> test for invalid position input.
>>
>> //-----------------------------
>> import std.perf;
>> import std.stdio;
>> import std.utf;
>>
>>
>>  dchar getCharAt(T)(T pText, int pPos)
>>  {
>>        size_t lUTF_Index;
>>        uint   lStride;
>>
>>        if (pPos < 0 || pPos >= pText.length)
>>         return dchar.init;
>>        // Firstly, find out where the character starts in the string.
>>        lUTF_Index = std.utf.toUTFindex(pText, pPos);
>>
>>        // Then find out its width (in bytes)
>>        lStride = std.utf.stride(pText, lUTF_Index);
>>
>>        // Return the character encoded in UTF format.
>>        return std.utf.toUTF32(
>>                 pText[lUTF_Index .. lUTF_Index + lStride])[0];
> 
> I think you can change these last two statements to just:
> ---
>     return pText.decode(lUTF_Index);
> ---
> (that's std.utf.decode, just to be clear)
> That changes the index variable passed, but that doesn't matter here.
> 
>> }
> [snip]
>> //-----------------------------
>>
>> On my machine (Intel Core 2 6600 @ 2.40GHz, 2GB RAM) I got this result
>> ...
>>
>> c:\temp>test
>> Derek Parnell:    7939664
>>   Daniel Keep:   26683373
> 
> With mine added: (and obviously on _my_ machine)
> ---
> urxae@urxae:~/tmp$ dmd -O -release -inline -run test.d
>    Derek Parnell:   17693368
>      Daniel Keep:   54037341
> Frits van Bommel:   12045495
> urxae@urxae:~/tmp$ gdc -O3 -finline -frelease -o test test.d && ./test
>    Derek Parnell:   19567337
>      Daniel Keep:   26750383
> Frits van Bommel:   14332419
> ---
> (My machine & compilers: AMD Sempron 3200+, 1GB RAM, 64-bit Ubuntu 6.10,
> running DMD 1.013 and GDC 0.23/x86_64)
> 
> So my version is even faster (about 30%), at least on my machine. And
> IMHO it's also more readable. No need to know what "stride" is, for
> example.

Yoikes!  I'm rather amazed that the "simple" foreach method is that much
slower.  I'll add the faster version to the article as soon as I get the
chance.
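
For reference, here is a minimal sketch of what a decode-based version
might look like. It is not the code that was benchmarked above: it assumes
a current D2 Phobos (std.utf.stride and std.utf.decode) rather than
DMD 1.013, and it walks the string with stride instead of calling
toUTFindex, so that a position past the last character can be caught
before decoding. The name getCharAt is kept from the quoted routine; the
parameter names are my own.

```d
import std.utf : decode, stride;

// Returns the code point at character position `pos` (0-based),
// or dchar.init if `pos` is past the end of `text`.
// Sketch only: assumes D2 Phobos and well-formed UTF input.
dchar getCharAt(C)(const(C)[] text, size_t pos)
{
    size_t index = 0;

    // Skip `pos` whole characters, one encoded sequence at a time.
    while (pos > 0 && index < text.length)
    {
        index += stride(text, index);
        --pos;
    }

    // Ran off the end: the requested position does not exist.
    if (index >= text.length)
        return dchar.init;

    // decode reads one character starting at `index` (and advances
    // `index`, which does not matter here).
    return decode(text, index);
}
```

Something like getCharAt("héllo", 1) should then give 'é' even though that
character takes two bytes in UTF-8.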

Thanks, guys.

	-- Daniel

-- 
int getRandomNumber()
{
    return 4; // chosen by fair dice roll.
              // guaranteed to be random.
}

http://xkcd.com/

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/