Get Character At?

Daniel Keep daniel.keep.lists at gmail.com
Tue Apr 24 20:41:25 PDT 2007



Derek Parnell wrote:
> On Tue, 24 Apr 2007 11:56:19 -0400, okibi wrote:
> 
>> Derek Parnell Wrote:
>>
>>> On Tue, 24 Apr 2007 10:30:16 -0400, okibi wrote:
>>>
>>>> Is there a getCharAt() function for D?
>>> Get a character from what? A string, a file, a console screen, ... ?
>>>
>> Such as this:
>>
>> char[] text = "This is a test sentence.";
>>
>> int loc = 5;
>>
>> char num5 = text.getCharAt(loc);
>>
>> Something along those lines.
> 
> Because char[] represents a UTF-8 encoded unicode string, to get the Nth
> character (first character is at position 1), try this ...
> 
>    import std.stdio;
>    import std.utf;
> 
>    T getCharAt(T)(T pText, uint pPos)
>    {
>        size_t lUTF_Index;
>        uint   lStride;
> 
>        // Firstly, find out where the character starts in the string.
>        lUTF_Index = std.utf.toUTFindex(pText, pPos-1);
> 
>        // Then find out its width (in bytes)
>        lStride = std.utf.stride(pText, lUTF_Index);
> 
>        // Return the character encoded in UTF format.
>        return pText[lUTF_Index .. lUTF_Index + lStride];
>    }
> 
>    void main()
>    {
>        char[] text = "a\ua034bcdef";
>        uint loc = 4;
>        writefln("%s", getCharAt(text, loc)); // shows "c"
>        writefln("%s", text[loc-1]); // correctly fails: text[3] is the last byte of \ua034
>    }
> 
> 
> If you just use 'text[loc]', you may not get the correct character, and you
> actually only get a single UTF-8 code unit anyway.
> 
> Remember that char[] is not an array of characters. It is an array of UTF-8
> code units (each 1 byte wide), and a UTF-8 encoded character (code
> point) takes from 1 to 4 code units.
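
To make the difference between code units (the 1-byte fragments) and code
points concrete, here is a minimal sketch. It assumes a D2 compiler and
Phobos' std.utf (count and stride), and it reuses Derek's test string, in
which \ua034 takes three bytes in UTF-8.

```d
import std.stdio : writeln;
import std.utf : count, stride;

void main()
{
    // Derek's test string: 'a', then U+A034 (3 bytes in UTF-8), then "bcdef".
    string text = "a\ua034bcdef";

    writeln(text.length);     // code units (bytes): 9
    writeln(count(text));     // code points: 7
    writeln(stride(text, 1)); // width in bytes of the code point starting at index 1: 3
}
```

So text[3] is still inside \ua034, and the fourth character only starts at
byte index 5.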

I was going to post a link to my old Text In D article[1], but I guess
that'd be redundant now :P

Incidentally, I don't suppose you know anything about the relative
performance of your method up there ^^ and the one in my article down
here vv:

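> // Returns the n-th code point of the string (n is 0-based),
> // or dchar.init if n is out of range.
> dchar nthCharacter(char[] string, int n)
> {
>     int curChar = 0;
>     foreach( dchar cp ; string )
>         if( curChar++ == n )
>             return cp;
>     return dchar.init;
> }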

I'm curious since I don't want to recommend a slow solution if I can
help it :)
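
For anyone who wants to measure it, below is a rough timing sketch, not a
rigorous benchmark. It assumes a D2 compiler and Phobos'
std.datetime.stopwatch; the names charAtByIndex and charAtByForeach, the
test data and the iteration count are all made up for illustration. Both
functions are 0-based here so they take the same argument. Both approaches
have to walk the string from the start, so both are linear in n; the sketch
only shows how the two could be compared.

```d
import std.datetime.stopwatch : benchmark;
import std.stdio : writefln;
import std.utf : stride, toUTFindex;

// Index-based lookup in the style of Derek's getCharAt, but 0-based.
// Returns the UTF-8 slice that holds the n-th code point.
const(char)[] charAtByIndex(const(char)[] text, size_t n)
{
    immutable start = toUTFindex(text, n);
    return text[start .. start + stride(text, start)];
}

// foreach-decoding lookup in the style of nthCharacter.
// Returns dchar.init if n is out of range.
dchar charAtByForeach(const(char)[] text, size_t n)
{
    size_t cur = 0;
    foreach (dchar cp; text)
        if (cur++ == n)
            return cp;
    return dchar.init;
}

void main()
{
    // Hypothetical test data: 10_000 copies of a 4-code-point chunk.
    string text;
    foreach (i; 0 .. 10_000)
        text ~= "a\ua034bc";
    immutable size_t n = 39_999; // index of the last code point

    // Sanity check: both approaches find the same character.
    assert(charAtByIndex(text, n) == "c");
    assert(charAtByForeach(text, n) == 'c');

    // Accumulate into a sink so the calls are not optimised away.
    size_t sink;
    void byIndex()   { sink += charAtByIndex(text, n).length; }
    void byForeach() { sink += charAtByForeach(text, n); }

    auto times = benchmark!(byIndex, byForeach)(1_000);
    writefln("toUTFindex + stride: %s", times[0]);
    writefln("foreach over dchar:  %s", times[1]);
}
```

The numbers will depend on the compiler, the optimisation flags and how
much of the text is multi-byte, so treat any single run with suspicion.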

	-- Daniel

[1] http://www.prowiki.org/wiki4d/wiki.cgi?DanielKeep/TextInD


More information about the Digitalmars-d-learn mailing list