UTF-8 issues

Benji Smith dlanguage at benjismith.net
Mon Sep 15 23:42:03 PDT 2008


Eldar Insafutdinov wrote:
> Yeah, I know that this operation works with bytes rather than chars. But it is stated explicitly here, http://www.digitalmars.com/d/2.0/cppstrings.html, that strings support slicing:
> 
>> D has the array slice syntax, not possible with C++:
> 
>> char[] s1 = "hello world";
>> char[] s2 = s1[6 .. 11];	// s2 is "world"
> 
> So this example is correct only for plain Latin (ASCII) chars, where one char is one byte, but in general it is wrong for UTF-8 strings.

That's my understanding.
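
To make the difference concrete, here is a minimal sketch, assuming a current D2 compiler and Phobos (std.utf.toUTF8 and std.utf.toUTF32). The helper name sliceCodePoints is hypothetical, made up for illustration; it is not part of Phobos or of the page linked above. It decodes to UTF-32 so that indices count code points rather than bytes.

```d
import std.stdio : writeln;
import std.utf : toUTF8, toUTF32;

// Hypothetical helper (not from Phobos): slice by code point index
// by decoding to UTF-32 first, then re-encoding the result as UTF-8.
string sliceCodePoints(string s, size_t from, size_t to)
{
    dstring wide = toUTF32(s);       // one array element per code point
    return toUTF8(wide[from .. to]);
}

void main()
{
    // ASCII only: one byte per character, so byte indices and
    // character indices coincide.
    string latin = "hello world";
    assert(latin[6 .. 11] == "world");

    // Each Cyrillic letter takes two bytes in UTF-8, the space takes one.
    string cyr = "привет мир";
    assert(cyr.length == 19);        // length counts bytes, not the 10 characters

    // A plain slice uses byte offsets: the last word starts at byte 13.
    assert(cyr[13 .. 19] == "мир");

    // The helper uses code point offsets instead: characters 7 up to 10.
    assert(sliceCodePoints(cyr, 7, 10) == "мир");
    writeln(sliceCodePoints(cyr, 7, 10));
}
```

A plain slice such as cyr[6 .. 11] would start and end in the middle of a two-byte sequence and so would not be valid UTF-8. Decoding the whole string costs O(n) per slice, so this is only meant to show the distinction, not to recommend an approach.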

--benji


