Why can't D store all UTF-8 code units in char type? (not really understanding explanation)
rikki cattermole
rikki at cattermole.co.nz
Fri Dec 2 21:26:40 UTC 2022
char is always a UTF-8 code unit and therefore exactly 1 byte.
wchar is always a UTF-16 code unit and therefore exactly 2 bytes.
dchar is always a UTF-32 code unit (i.e. a whole code point) and therefore exactly 4 bytes.
'Ğ' is the code point U+011E (decimal 286), which is more than 1 byte can
hold (the maximum is 255). UTF-8 encodes it as two code units (0xC4 0x9E),
so you need 2 chars or 1 wchar/dchar.
https://unicode-table.com/en/011E/
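
To make the sizes above concrete, here is a minimal sketch in D. It assumes the source file is saved as UTF-8 and uses std.utf.codeLength from Phobos; the variable names are only for illustration.

```d
import std.utf : codeLength;

void main()
{
    // Each type holds exactly one code unit of its encoding.
    static assert(char.sizeof == 1);   // UTF-8 code unit
    static assert(wchar.sizeof == 2);  // UTF-16 code unit
    static assert(dchar.sizeof == 4);  // UTF-32 code unit

    // A dchar can hold the whole code point.
    dchar c = 'Ğ';
    assert(c == 0x011E);

    // char x = 'Ğ';  // rejected by the compiler: the value does not fit in one char

    // The same character as a string in each encoding.
    string  s8  = "Ğ";   // UTF-8
    wstring s16 = "Ğ"w;  // UTF-16
    dstring s32 = "Ğ"d;  // UTF-32

    // .length counts code units, not characters.
    assert(s8.length == 2);
    assert(s16.length == 1);
    assert(s32.length == 1);

    // codeLength reports how many code units a code point needs.
    assert(codeLength!char(c) == 2);
    assert(codeLength!wchar(c) == 1);

    // The two UTF-8 code units of U+011E.
    assert(cast(ubyte) s8[0] == 0xC4);
    assert(cast(ubyte) s8[1] == 0x9E);
}
```

So a single char is enough for ASCII, but anything above U+007F has to live in a string (or char[]) as several chars, or in one wchar/dchar.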