Working with utf

Simen Haugen simen at norstat.no
Thu Jun 14 06:12:22 PDT 2007


"Regan Heath" <regan at netmail.co.nz> wrote in message 
news:f4rd7m$dlb$1 at digitalmars.com...
> I think what we want for this is a String class which internally stores 
> the data as utf-8, 16 or 32 (making its own decision or being told which 
> to use) and provides slicing of characters as opposed to codepoints.
>
> Then, all you need is to convert from latin-1 to String, do all your work 
> with String and convert back to latin-1 only if/when you need to write it 
> back to a file or similar.
>
> My gut feeling is that this functionality belongs in a class and not the 
> language itself.  After all, you may want/need to manipulate utf-8, 16, or 
> 32 codepoints directly for some reason.
>
> Regan Heath

That would have been a very nice addition. I cannot even count how many 
hard-to-find bugs I've had because of this (both slicing and length).

UTF-8 and slicing are both supported by the language, right? To me it sounds 
more like a bug that these won't work together, as I tend to trust that 
language features work.
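
To show the kind of surprise I mean, here is a minimal sketch. It assumes a 
current Phobos, where std.utf provides count, toUTF32, toUTF8 and validate. 
The helper sliceCodePoints is a hypothetical name made up for this example, 
not something in the language or the library. As far as I understand it, 
.length and slicing on a UTF-8 string work on bytes (code units), so they 
only agree with what you see on screen for pure ASCII text.

```d
import std.exception : assertThrown;
import std.utf : UTFException, count, toUTF32, toUTF8, validate;

// Hypothetical helper: slice by code point instead of by UTF-8 code unit.
// It converts to UTF-32 first, so it copies the string; fine for a sketch.
string sliceCodePoints(string s, size_t from, size_t to)
{
    dstring wide = toUTF32(s);        // one array element per code point
    return toUTF8(wide[from .. to]);  // convert the slice back to UTF-8
}

void main()
{
    // "blåbær" written with escapes so the source file encoding does not matter.
    string s = "bl\u00E5b\u00E6r";

    assert(s.length == 8);  // bytes: å and æ take two bytes each in UTF-8
    assert(count(s) == 6);  // code points

    // Slicing by byte index can cut a multi-byte sequence in half:
    // s[0 .. 3] is "bl" plus only the first byte of å, which is not valid UTF-8.
    assertThrown!UTFException(validate(s[0 .. 3]));

    // Slicing by code point gives the expected first three letters.
    assert(sliceCodePoints(s, 0, 3) == "bl\u00E5");
}
```

Even this only counts code points. A letter written as a base character plus 
a combining accent would still count as two, so a String class like the one 
Regan describes would have to decide what it means by "character".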




