Operating with substrings in strings
Oskar Linde
olREM at OVEnada.kth.se
Fri Aug 18 15:29:36 PDT 2006
Frank Benoit wrote:
>
>> Slicing:
>>
>> char[] h = "hello";
>> char[] sub = h[1..3]; // Slice the string "hello"
>> writefln(sub); // Prints "el"
>>
>> http://digitalmars.com/d/arrays.html#slicing
>>
>
> I do not know much about UTF-8, and I am often not sure whether I do
> string processing right. Can someone enlighten me?
>
> If I have
> char[] str = ... some multibyte utf8 chars;
>
> What does str.length give me? The number of bytes, or the number of
> characters, found by checking which characters are multi-byte?
The number of bytes.
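To make the difference concrete, here is a minimal sketch. It assumes a current D2 compiler and Phobos, where string literals are `string` (immutable) rather than the `char[]` used in this thread, and it uses `walkLength` from `std.range` to count characters:

```d
import std.range : walkLength;
import std.stdio : writeln;

void main()
{
    // "h\u00E9llo" is "héllo"; the escape keeps this source file ASCII-only.
    string str = "h\u00E9llo";

    writeln(str.length);     // 6: UTF-8 code units (bytes), 'é' takes two
    writeln(str.walkLength); // 5: code points, decoded while walking the string
}
```

Note that `walkLength` has to decode the whole string, so it is linear in the string length, while `.length` is constant time.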
>
> If I do some slicing (str[3..4]), do the indices slice at these byte
> positions, so that I risk destroying the string, or does it look
> at the characters to find the start of the third UTF-8 character?
It counts byte positions. And you are correct: you risk splitting in the
middle of a UTF-8 code sequence, making the string invalid.
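A rough sketch of both the problem and one way around it, again assuming current D2 and Phobos (`std.utf.stride`, `std.utf.validate`). The helpers `byteOffset` and `sliceByCodePoint` are hypothetical names made up for this example, not library functions:

```d
import std.exception : assertThrown;
import std.stdio : writeln;
import std.utf : stride, validate, UTFException;

// Hypothetical helper: byte offset of the n-th code point in s.
size_t byteOffset(string s, size_t n)
{
    size_t i = 0;
    foreach (_; 0 .. n)
        i += stride(s, i); // length in bytes of the sequence starting at i
    return i;
}

// Hypothetical helper: slice by code point indices instead of byte indices.
string sliceByCodePoint(string s, size_t from, size_t to)
{
    return s[byteOffset(s, from) .. byteOffset(s, to)];
}

void main()
{
    string str = "h\u00E9llo"; // "héllo", 'é' occupies bytes 1 and 2

    string broken = str[0 .. 2];                 // byte slice cuts 'é' in half
    assertThrown!UTFException(validate(broken)); // no longer valid UTF-8

    writeln(sliceByCodePoint(str, 1, 3)); // prints "él"
}
```

Walking the string with `stride` costs time proportional to the offset, which is presumably why plain slicing works on bytes: it stays a constant-time operation.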
>
> Or did I miss something completely?
Not as far as I can tell. :)
/Oskar