Why can't D store all UTF-8 code units in char type? (not really understanding explanation)

rikki cattermole rikki at cattermole.co.nz
Fri Dec 2 21:44:22 UTC 2022


On 03/12/2022 10:35 AM, Adam D Ruppe wrote:
> On Friday, 2 December 2022 at 21:26:40 UTC, rikki cattermole wrote:
>> char is always UTF-8 codepoint and therefore exactly 1 byte.
>> wchar is always UTF-16 codepoint and therefore exactly 2 bytes.
>> dchar is always UTF-32 codepoint and therefore exactly 4 bytes;
> 
> You mean "code unit". There's no such thing as a utf-8/16/32 codepoint. 
> A codepoint is a more abstract concept that is encoded in one of the utf 
> formats.

Yeah, you're right, it's code unit, not code point.
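
To make the distinction concrete, here is a minimal sketch (not from the original exchange; the two sample characters are chosen purely for illustration). Each of char, wchar and dchar has a fixed size because it holds one code unit, while a single code point may need several code units depending on the encoding:

```d
void main()
{
    // Each character type holds exactly one code unit of its encoding.
    static assert(char.sizeof  == 1); // UTF-8 code unit
    static assert(wchar.sizeof == 2); // UTF-16 code unit
    static assert(dchar.sizeof == 4); // UTF-32 code unit

    // .length counts code units, not code points.
    // U+00E9 (e with acute accent) is a single code point:
    assert("\u00E9".length  == 2); // two UTF-8 code units
    assert("\u00E9"w.length == 1); // one UTF-16 code unit
    assert("\u00E9"d.length == 1); // one UTF-32 code unit

    // U+1F600 lies outside the Basic Multilingual Plane:
    assert("\U0001F600".length  == 4); // four UTF-8 code units
    assert("\U0001F600"w.length == 2); // a UTF-16 surrogate pair
    assert("\U0001F600"d.length == 1); // one UTF-32 code unit
}
```

So a single char can hold any UTF-8 code unit, but not every code point.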


