Selectable encodings
Anders F Björklund
afb at algonet.se
Thu Apr 6 08:12:39 PDT 2006
James Dunne wrote:
> The char type is really a misnomer for dealing with UTF-8 encoded
> strings. It should be named closer to "code-unit for UTF-8 encoding".
Yeah, but a char does hold a complete *ASCII* character, right?
Usually D code handles a char[] by decoding it into dchar values,
but with a "short path" for ASCII characters, which need no decoding...
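
Something like the following is what I mean by a "short path". It is only
a rough sketch, not code taken from Phobos: the function name toCodePoints
is made up for illustration, and it assumes std.utf.decode, which advances
the index past one whole UTF-8 sequence and throws on malformed input.

```d
import std.utf : decode;

// Hypothetical helper: expand a UTF-8 char[] into an array of dchar.
dchar[] toCodePoints(const(char)[] s)
{
    dchar[] result;
    size_t i = 0;
    while (i < s.length)
    {
        char c = s[i];
        if (c < 0x80)
        {
            // Short path: an ASCII code unit is already a full character.
            result ~= cast(dchar) c;
            ++i;
        }
        else
        {
            // General path: decode a multi-byte sequence;
            // decode() moves i past every code unit it consumed.
            result ~= decode(s, i);
        }
    }
    return result;
}
```

For plain ASCII input the loop never calls decode at all, so the common
case costs one comparison per code unit.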
> I could be wrong (and I bet I am) on the terminology used to describe
> char, but I really mean it to just store a full Unicode character
> such that strings of chars can safely assume character index == array
> index.
For the general case, UTF-32 is a pretty wasteful
Unicode encoding to use just to have that privilege.
See http://www.unicode.org/faq/utf_bom.html#12
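
To put rough numbers on "wasteful", here is a small sketch that measures
the same text in D's three string types. The names Sizes and encodedSizes
are made up for this example; the conversions assume std.conv.to.

```d
import std.conv : to;

// Hypothetical result type: storage in bytes for each encoding.
struct Sizes
{
    size_t utf8;
    size_t utf16;
    size_t utf32;
}

// Measure how many bytes the same text needs as char[], wchar[] and dchar[].
Sizes encodedSizes(string s)
{
    return Sizes(
        s.length * char.sizeof,               // UTF-8: 1 byte per code unit
        s.to!wstring.length * wchar.sizeof,   // UTF-16: 2 bytes per code unit
        s.to!dstring.length * dchar.sizeof);  // UTF-32: 4 bytes per code unit
}
```

For mostly-ASCII text the dchar[] form comes out close to four times the
size of the char[] form, which is the price of having character index ==
array index.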
--anders