[phobos] UTF-8 string slicing

Jacob Carlborg doob at me.com
Sat Aug 20 05:20:19 PDT 2011


On 20 aug 2011, at 11:38, unDEFER wrote:

> On Sat, 20 Aug 2011 06:49:33 +0400, Walter Bright <walter at digitalmars.com> wrote:
> 
>> There isn't any getting away from understanding that UTF-8 is a multi-byte encoding.
> 
> If it is so, then arr.popFront() must break UTF-8 strings ;-)
> 
>> If you want to use an encoding with a 1:1 correspondence between indices and characters, use dchar encoding.
> 
> For me, using 4 times more memory for ASCII seems too wasteful, sorry.


If you know for sure that a string will only contain ASCII, you can use .length and the built-in slicing syntax. It doesn't matter which Unicode encoding you use; a non-ASCII character will take up more than one byte.
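
To illustrate the difference, here is a minimal sketch. The variable names and sample strings are made up for this example, and it assumes walkLength from std.range decodes a string into code points, as it does in current Phobos:

```d
import std.range : walkLength;
import std.stdio : writeln;

void main()
{
    string ascii = "hello";      // ASCII only: 5 characters, 5 bytes
    string mixed = "h\u00E9llo"; // 'é' (U+00E9) takes 2 bytes in UTF-8

    // ASCII only: .length and slice indices line up with characters.
    assert(ascii.length == 5);
    assert(ascii[1 .. 3] == "el");

    // Non-ASCII: .length counts UTF-8 code units (bytes), not characters.
    assert(mixed.length == 6);
    assert(mixed.walkLength == 5); // counts decoded code points

    // Slice indices are byte offsets, so "h" plus 'é' ends at byte 3.
    assert(mixed[0 .. 3] == "h\u00E9");

    // dstring gives 1:1 indexing, at 4 bytes per character.
    dstring wide = "h\u00E9llo"d;
    assert(wide.length == 5);
    assert(wide.length * dchar.sizeof == 20);

    writeln("all checks passed");
}
```

So for ASCII-only data a plain string costs one byte per character and slices safely by index; the 4x overhead only applies if you switch to dstring.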

-- 
/Jacob Carlborg


