DMD: invalid UTF character `\U0000d800`
Jacob Carlborg
doob at me.com
Sat Nov 7 17:49:54 UTC 2020
On Saturday, 7 November 2020 at 16:12:06 UTC, Per Nordlöw wrote:
> CtoLexer_parser.d 665 57 error invalid UTF character \U0000d800
> CtoLexer_parser.d 665 67 error invalid UTF character \U0000dbff
> CtoLexer_parser.d 666 28 error invalid UTF character \U0000d800
> CtoLexer_parser.d 666 38 error invalid UTF character \U0000dbff
> CtoLexer_parser.d 666 53 error invalid UTF character \U0000dc00
> CtoLexer_parser.d 666 63 error invalid UTF character \U0000dfff
>
> Doesn't DMD support these Unicodes yet?
They're not valid characters. U+D800 through U+DFFF are the
surrogate code points:
"The Unicode standard permanently reserves these code point
values for UTF-16 encoding of the high and low surrogates, and
they will never be assigned a character, so there should be no
reason to encode them. The official Unicode standard says that no
UTF forms, including UTF-16, can encode these code points" [1].
"... the standard states that such arrangements should be treated
as encoding errors" [1].
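They only have meaning in pairs: in UTF-16, a high surrogate code
unit (0xD800-0xDBFF) followed by a low surrogate code unit
(0xDC00-0xDFFF) encodes a single code point above U+FFFF. A
surrogate on its own is not a character, so to get such a character
in D source you would write the code point itself (for example
\U0001F600) rather than its surrogate halves.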
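To make the surrogate arithmetic concrete, here is a minimal sketch in
Python (chosen only for illustration; this is not how DMD implements
its check, and the function names are hypothetical). It shows how a
high/low surrogate pair maps to one code point above U+FFFF and back,
and why a lone value in U+D800..U+DFFF is rejected.

```python
# Surrogate ranges as defined for UTF-16.
HIGH_START, HIGH_END = 0xD800, 0xDBFF
LOW_START, LOW_END = 0xDC00, 0xDFFF


def is_surrogate(cp: int) -> bool:
    """True if cp lies in the reserved surrogate range U+D800..U+DFFF."""
    return HIGH_START <= cp <= LOW_END


def combine_surrogates(high: int, low: int) -> int:
    """Map a UTF-16 high/low surrogate pair to the code point it encodes."""
    if not HIGH_START <= high <= HIGH_END:
        raise ValueError(f"not a high surrogate: {high:#06x}")
    if not LOW_START <= low <= LOW_END:
        raise ValueError(f"not a low surrogate: {low:#06x}")
    # Each surrogate carries 10 bits of the offset above 0x10000.
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


def split_into_surrogates(cp: int) -> tuple[int, int]:
    """Inverse mapping: a code point above U+FFFF to its surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError(f"not a supplementary code point: {cp:#x}")
    offset = cp - 0x10000
    return HIGH_START + (offset >> 10), LOW_START + (offset & 0x3FF)


if __name__ == "__main__":
    print(hex(combine_surrogates(0xD83D, 0xDE00)))
    print([hex(u) for u in split_into_surrogates(0x1F600)])
    print(is_surrogate(0xD800), is_surrogate(0x1F600))
```

The four values in the error messages (d800, dbff, dc00, dfff) are
exactly the endpoints of the two ranges above, which suggests the
source was describing the surrogate ranges themselves rather than
real characters.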
[1] https://en.wikipedia.org/wiki/UTF-16#U+D800_to_U+DFFF
--
/Jacob Carlborg