UTF-8 issues

Jarrett Billingsley jarrett.billingsley at gmail.com
Tue Sep 16 15:05:02 PDT 2008


On Tue, Sep 16, 2008 at 4:57 PM, Lutger <lutger.blijdestijn at gmail.com> wrote:
> Jarrett Billingsley wrote:
> ...
>>
>> It's called UTF-8, and it's supposed to work like that.  That D does
>> not provide some kind of interface for dealing with multibyte
>> encodings (other than foreach and the encode/decode functions) is a
>> failing on its part, not Unicode's.
>
> There's also std.string of course. What do you find so lacking? (just
> curious)
>

The lack of any way to index or slice a string by codepoint index
(instead of by code unit index, i.e. byte/short index), to get the
length of a string in codepoints, or to find the start of the nearest
character given an arbitrary code unit index.  (std.string is also
embarrassingly missing any functionality for wchar[] or dchar[], but
that's a slightly different issue.)
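
To make the three missing operations concrete, here is a rough sketch of
what they could look like.  It is written in Python over a bytes object,
which stands in for a D char[] (raw UTF-8 code units).  All of the
function names are hypothetical and made up for illustration; none of
them is a Phobos API.  The sketch also assumes the input is valid UTF-8
and does no error checking.

```python
# Hypothetical helpers, for illustration only. A bytes object plays the
# role of D's char[]: a sequence of raw UTF-8 code units.
# Assumption: the input is always valid UTF-8.

def is_continuation(byte: int) -> bool:
    """True for UTF-8 continuation bytes (bit pattern 10xxxxxx)."""
    return (byte & 0xC0) == 0x80

def codepoint_length(s: bytes) -> int:
    """Length in code points: count every byte that starts a sequence."""
    return sum(1 for b in s if not is_continuation(b))

def align_to_start(s: bytes, i: int) -> int:
    """Step back from byte index i to the first byte of its code point."""
    while 0 < i < len(s) and is_continuation(s[i]):
        i -= 1
    return i

def byte_offset(s: bytes, n: int) -> int:
    """Byte index where code point n (0-based) starts.

    n == codepoint_length(s) is allowed and yields len(s), so the result
    can be used as the end of a slice.
    """
    count = 0
    for i, b in enumerate(s):
        if not is_continuation(b):
            if count == n:
                return i
            count += 1
    if count == n:
        return len(s)
    raise IndexError("code point index out of range")

def slice_codepoints(s: bytes, start: int, stop: int) -> bytes:
    """Slice by code point indices instead of byte indices."""
    return s[byte_offset(s, start):byte_offset(s, stop)]

# "naïve ☃" is 7 code points but 10 bytes in UTF-8.
text = "naïve ☃".encode("utf-8")
snowman = slice_codepoints(text, 6, 7)
```

Note that byte_offset is a linear scan, so indexing by codepoint this
way is O(n) rather than O(1).  That cost is inherent to a
variable-width encoding, not to this particular sketch.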


