Operating with substrings in strings

Fri Aug 18 13:57:21 PDT 2006

Oskar Linde wrote:
> Frank Benoit wrote:
>> What does str.length give me? The number of bytes, or the number of
>> characters, found by checking each character to see which are multi-byte?
> 
> The number of bytes.
> 
>> If I do some slicing (str[3..4]), do the indices slice at these byte
>> positions, with the risk of destroying the string, or does it look
>> at the characters to find the start of the third UTF-8 character?
> 
> It counts the byte positions. And you are correct: you risk splitting in the
> middle of a UTF-8 code sequence, making the string invalid.
> 
> /Oskar
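
To make the quoted answer concrete, here is a minimal sketch. It assumes a
current D2 compiler with Phobos, where std.utf.validate throws a UTFException
on malformed input. The helper name isValidUtf8 is hypothetical, not a
library function.

```d
import std.utf : UTFException, validate;

// Hypothetical helper: true if s is well-formed UTF-8.
bool isValidUtf8(string s)
{
    try
    {
        validate(s);          // throws UTFException on malformed input
        return true;
    }
    catch (UTFException e)
    {
        return false;
    }
}

void main()
{
    // 'a', U+00E4 (two bytes in UTF-8: 0xC3 0xA4), 'b'
    string s = "a\u00E4b";

    assert(s.length == 4);    // length counts bytes (code units), not characters

    string cut = s[0 .. 2];   // byte slice that ends inside the two-byte sequence
    assert(isValidUtf8(s));
    assert(!isValidUtf8(cut));
}
```

The slice itself is not rejected by the language; the damage only shows up
when something later tries to decode the result.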

char is supposed to be a UTF-8 character. What is the difference from ubyte,
or from an 'ascii/latin1/...' char, if there is no native support?

If the functionality lives in a library such as Phobos' std.utf,
ubyte/ushort/uint would work just as well. (OK, the init values are
different, but I hope that is not all.)
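
As an illustration of what the library route looks like, here is a sketch
that slices on character boundaries using std.utf.stride, which returns the
number of code units in the sequence starting at a given index. It assumes a
current D2 Phobos. byteOffset is a hypothetical helper name, and the input
is assumed to be valid UTF-8.

```d
import std.utf : stride;

// Hypothetical helper: byte offset at which the n-th code point of s starts
// (or s.length if s has fewer than n code points).
size_t byteOffset(string s, size_t n)
{
    size_t i = 0;
    while (n > 0 && i < s.length)
    {
        i += stride(s, i);    // skip one whole UTF-8 sequence
        --n;
    }
    return i;
}

void main()
{
    // 'a', U+00E4 (two bytes), 'b', 'c': 5 bytes, 4 code points
    string s = "a\u00E4bc";

    auto from = byteOffset(s, 1);
    auto to   = byteOffset(s, 3);
    assert(from == 1 && to == 4);

    // Slice covering code points 1 and 2, cut on sequence boundaries
    assert(s[from .. to] == "\u00E4b");
}
```

Every such lookup walks the string from the start, so it is linear in the
offset rather than constant time like a plain byte index.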

Is dchar (UTF-32) the only safe option for working with strings easily and
correctly?
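
For comparison, a sketch of what the dchar route would look like, assuming a
current D2 compiler and Phobos (std.utf.toUTF32 and foreach with a dchar loop
variable). countCodePoints is a hypothetical helper name.

```d
import std.utf : toUTF32;

// Hypothetical helper: counts code points by letting foreach decode UTF-8.
size_t countCodePoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)      // a dchar loop variable decodes each sequence
        ++n;
    return n;
}

void main()
{
    // 'a', U+00E4 (two bytes in UTF-8), 'b'
    string s = "a\u00E4b";

    assert(s.length == 4);              // bytes
    assert(countCodePoints(s) == 3);    // code points

    dstring d = toUTF32(s);             // one dchar per code point
    assert(d.length == 3);
    assert(d[1] == '\u00E4');           // index addresses a whole code point
    assert(d[1 .. 2] == "\u00E4"d);     // slicing cannot split a code point
}
```

The cost is four bytes per code point, and a code point is still not always
what a user sees as one character (combining marks, for example).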