Why can't D store all UTF-8 code units in char type? (not really understanding explanation)

Adam D Ruppe destructionator at gmail.com
Fri Dec 2 21:35:14 UTC 2022


On Friday, 2 December 2022 at 21:26:40 UTC, rikki cattermole 
wrote:
> char is always UTF-8 codepoint and therefore exactly 1 byte.
> wchar is always UTF-16 codepoint and therefore exactly 2 bytes.
> dchar is always UTF-32 codepoint and therefore exactly 4 bytes;

You mean "code unit". There's no such thing as a UTF-8/16/32 
code point. A code point is a more abstract concept (a number 
Unicode assigns to a character) that is encoded as one or more 
code units in one of the UTF formats.
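
To make the distinction concrete, here is a minimal sketch in D. The helper name countCodePoints and the two sample characters (U+20AC and U+1F600) are my own choices for illustration, not anything from the thread. It shows that .length on a string counts code units, while decoding with a dchar loop variable counts code points.

```d
import std.stdio : writeln;

// Hypothetical helper for illustration: counts code points by letting
// foreach decode the UTF-8 code units into dchar values.
size_t countCodePoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    // The code unit types have fixed sizes.
    static assert(char.sizeof == 1);
    static assert(wchar.sizeof == 2);
    static assert(dchar.sizeof == 4);

    // One code point, U+20AC (euro sign), in the three encodings.
    string  a8  = "\u20AC";
    wstring a16 = "\u20AC"w;
    dstring a32 = "\u20AC"d;
    assert(a8.length == 3);   // three UTF-8 code units
    assert(a16.length == 1);  // one UTF-16 code unit
    assert(a32.length == 1);  // one UTF-32 code unit

    // One code point outside the Basic Multilingual Plane, U+1F600.
    string  b8  = "\U0001F600";
    wstring b16 = "\U0001F600"w;
    dstring b32 = "\U0001F600"d;
    assert(b8.length == 4);   // four UTF-8 code units
    assert(b16.length == 2);  // a surrogate pair in UTF-16
    assert(b32.length == 1);  // still one UTF-32 code unit

    // Either way, each string holds exactly one code point.
    assert(countCodePoints(a8) == 1);
    assert(countCodePoints(b8) == 1);

    writeln("code units: ", b8.length, ", code points: ", countCodePoints(b8));
}
```

So a single char can only ever hold one UTF-8 code unit, and a code point may need up to four of them.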



More information about the Digitalmars-d-learn mailing list