De-Referencing A Pointer
xs0
xs0 at xs0.com
Tue Mar 21 15:56:18 PST 2006
James Dunne wrote:
> Correct me on this if I am wrong:
>
> UNICODE is *not* an _encoding_ standard; it is a standard mapping of
> character glyphs to integer values and specifies no requirements for
> storage or encoding.
Well, since you asked - it's not glyphs, but characters :)
http://en.wikipedia.org/wiki/Glyph
> The encoding to which you (and many others) refer to by the name of
> UNICODE is in fact UCS-2, I believe. This is the encoding where the
> Basic Multilingual Plane (BMP) of the Unicode table maps directly onto
> 65536 values.
And here it's more complicated. UCS-2 is exactly what you say, but AFAIK
D uses UTF-16 (for wchar) and so does Windows:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_9i79.asp
The two are almost the same, except that UTF-16 allows surrogate pairs
(two 16-bit code units) for encoding characters in the other planes,
while UCS-2 can't represent those characters at all. I think UCS-2 is
somewhat deprecated generally, for exactly this reason.
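
To make the difference concrete, here is a minimal sketch in Python (not D, just because it's quick to try out). The helper names utf16_code_units and surrogate_pair are made up for this illustration. It shows that a BMP character takes one 16-bit code unit in UTF-16, while a character outside the BMP takes two, which is the case UCS-2 has no way to express.

```python
def utf16_code_units(ch):
    """Return the 16-bit UTF-16 code units that encode the string ch."""
    data = ch.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def surrogate_pair(code_point):
    """Compute by hand the surrogate pair for a code point above U+FFFF."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points outside the BMP use a surrogate pair")
    offset = code_point - 0x10000       # 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits -> low surrogate
    return high, low


bmp_char = "\u20ac"          # EURO SIGN, inside the BMP
other_char = "\U0001D11E"    # MUSICAL SYMBOL G CLEF, outside the BMP

print([hex(u) for u in utf16_code_units(bmp_char)])     # one code unit
print([hex(u) for u in utf16_code_units(other_char)])   # two code units
print([hex(u) for u in surrogate_pair(ord(other_char))])
```

For the BMP character the UTF-16 and UCS-2 encodings are identical; only the second character needs the pair.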
xs0