Get Character At?

Derek Parnell derek at psych.ward
Tue Apr 24 16:26:26 PDT 2007


On Tue, 24 Apr 2007 11:56:19 -0400, okibi wrote:

> Derek Parnell Wrote:
> 
>> On Tue, 24 Apr 2007 10:30:16 -0400, okibi wrote:
>> 
>>> Is there a getCharAt() function for D?
>> 
>> Get a character from what? A string, a file, a console screen, ... ?
>> 
>> -- 
>> Derek Parnell
>> Melbourne, Australia
>> "Justice for David Hicks!"
>> skype: derek.j.parnell
> 
> Such as this:
> 
> char[] text = "This is a test sentence.";
> 
> int loc = 5;
> 
> char num5 = text.getCharAt(loc);
> 
> Something along those lines.

Because char[] holds a UTF-8 encoded Unicode string, indexing it gives you
bytes, not characters. To get the Nth character (the first character is at
position 1), try this ...

   import std.stdio;
   import std.utf;

   // Returns the pPos-th character (1-based) of pText as a slice of pText,
   // so a multi-byte character comes back whole. pPos == 0 is not handled.
   T getCharAt(T)(T pText, uint pPos)
   {
       size_t lUTF_Index;
       uint   lStride;

       // First, find the byte index where the character starts.
       lUTF_Index = std.utf.toUTFindex(pText, pPos-1);

       // Then find its width (in bytes).
       lStride = std.utf.stride(pText, lUTF_Index);

       // Return the bytes that encode the character, still in UTF-8.
       return pText[lUTF_Index .. lUTF_Index + lStride];
   }

   void main()
   {
       char[] text = "a\ua034bcdef";
       uint loc = 4;
       writefln("%s", getCharAt(text, loc)); // shows "c"
       // correctly fails: text[3] is the last byte of \ua034, not a character
       writefln("%s", text[loc-1]);
   }


If you just use 'text[loc]', you may not get the correct character: you only
ever get a single byte, and that byte may be a fragment from the middle of a
multi-byte character.
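
To make the fragments visible, here is a minimal sketch that lists the bytes
of the same test string in hex. It assumes a current D2 compiler (the code
above is D1 style), and hexUnits is a hypothetical helper written for this
illustration, not a Phobos function.

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper for illustration: returns each UTF-8 code unit
// (byte) of s as a two-digit hex string.
string[] hexUnits(const(char)[] s)
{
    string[] units;
    foreach (char u; s)   // iterating as char visits raw bytes, no decoding
        units ~= format("%02X", cast(ubyte) u);
    return units;
}

void main()
{
    string text = "a\ua034bcdef";
    // Bytes 1..3 (EA 80 B4) together encode \ua034, so text[3] is only
    // the trailing byte of that character.
    writeln(hexUnits(text));
}
```

The string has 7 characters but 9 bytes, which is why a plain index drifts
away from the character position as soon as a non-ASCII character appears.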

Remember that char[] is not an array of characters. It is an array of UTF-8
code units (code point fragments, each 1 byte wide), and a UTF-8 encoded
character (code point) takes from 1 to 4 of them.
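
If you want the character as a single value rather than as a slice, one
option is to let foreach decode for you. This is a sketch only, again assuming
a D2 compiler; nthCodePoint is a hypothetical name, and it walks the string
from the start on every call, so it is not meant for tight loops.

```d
import std.stdio : writeln;

// Hypothetical helper, not part of Phobos: returns the code point at the
// 1-based position pos, or throws if pos is 0 or past the end.
dchar nthCodePoint(const(char)[] s, size_t pos)
{
    size_t n = 0;
    foreach (dchar c; s)   // iterating as dchar decodes UTF-8 on the fly
    {
        if (++n == pos)
            return c;
    }
    throw new Exception("position out of range");
}

void main()
{
    string text = "a\ua034bcdef";
    writeln(text.length);            // counts code units (bytes), not characters
    writeln(nthCodePoint(text, 4));  // the 4th character
}
```

Returning a dchar avoids handing back a partial character, at the cost of
leaving UTF-8: the result is a 32-bit code point, not a slice of the string.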
 
-- 
Derek Parnell
Melbourne, Australia
"Justice for David Hicks!"
skype: derek.j.parnell

