The length of strings vs. # of chars vs. sizeof
Rainer Deyke
rainerd at eldwood.com
Sun Nov 1 19:08:53 PST 2009
Jesse Phillips wrote:
> I believe the documentation you are looking for is:
>
> http://www.prowiki.org/wiki4d/wiki.cgi?DanielKeep/TextInD
>
> It is more about understanding UTF than it is about learning strings.
One thing that page fails to mention is that D has no awareness of
anything higher-level than code points. In particular:
- dchar contains a code point, not a logical character.
- D has no awareness of canonical forms and precomposed/decomposed
characters (at the language level). (Some characters can be represented
as either one or two code points. D does not know that these are
supposed to represent the same character.)
- Although D stops you from outputting an incomplete code point, it
does not stop you from outputting an incomplete logical character.
Also, some D library functions only work on the ASCII subset of UTF-8.
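
To make the code unit / code point / logical character distinction above concrete, here is a minimal sketch. The helper name codePoints is my own invention for illustration, not a library function. It relies only on foreach decoding a string into dchar values, and on the fact that "é" can be written either as the single code point U+00E9 or as "e" followed by U+0301 (combining acute accent).

```d
import std.stdio;

// Hypothetical helper: count code points by letting foreach
// decode the UTF-8 string into dchar values one at a time.
size_t codePoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    string precomposed = "\u00E9";  // é as one code point (U+00E9)
    string decomposed  = "e\u0301"; // e + U+0301 COMBINING ACUTE ACCENT

    // .length counts UTF-8 code units (bytes), not characters.
    assert(precomposed.length == 2);
    assert(decomposed.length == 3);

    // Decoding to dchar counts code points, still not logical characters.
    assert(codePoints(precomposed) == 1);
    assert(codePoints(decomposed) == 2);

    // Both spell the same logical character, but the language compares
    // code units and does no normalization, so they are not equal.
    assert(precomposed != decomposed);

    writeln("all checks passed");
}
```

Slicing decomposed[0 .. 1] yields a valid UTF-8 string containing only "e", which is the kind of incomplete logical character that nothing in the language guards against.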
--
Rainer Deyke - rainerd at eldwood.com