[Issue 1357] Cannot use FFFF and FFFE in Unicode escape sequences.
d-bugmail at puremagic.com
Sun Sep 30 13:07:40 PDT 2007
http://d.puremagic.com/issues/show_bug.cgi?id=1357
------- Comment #5 from aziz.kerim at gmail.com 2007-09-30 15:07 -------
While writing my own encoding/decoding functions for Unicode characters, I
found out that certain Unicode code points must not be encoded as UTF-8
sequences. I'm quoting from here:
http://www.cl.cam.ac.uk/~mgk25/unicode.html
"Also note that the code positions U+D800 to U+DFFF (UTF-16 surrogates) as well
as U+FFFE and U+FFFF must not occur in normal UTF-8 or UCS-4 data. UTF-8
decoders should treat them like malformed or overlong sequences for safety
reasons."
So the behaviour of the compiler is actually correct, and Phobos has a bug.
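
To make the quoted rule concrete, here is a minimal sketch of an encoder that
rejects those code points. It is written in Python only for illustration and is
not the compiler's or Phobos's actual code; the names is_encodable and
encode_utf8 are hypothetical, and the rejection of U+FFFE and U+FFFF follows
the quoted FAQ, which is an assumption that other decoders may not share.

```python
def is_encodable(cp):
    """Return True if code point cp may be encoded under the quoted rule."""
    if not 0 <= cp <= 0x10FFFF:
        return False          # outside the Unicode code space
    if 0xD800 <= cp <= 0xDFFF:
        return False          # UTF-16 surrogates
    if cp in (0xFFFE, 0xFFFF):
        return False          # excluded by the quoted FAQ (assumption)
    return True


def encode_utf8(cp):
    """Encode a single code point as UTF-8 bytes, or raise ValueError."""
    if not is_encodable(cp):
        raise ValueError("code point U+%04X must not be encoded" % cp)
    if cp < 0x80:             # 1 byte: 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:            # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:          # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

With this sketch, encode_utf8(0xFFFD) succeeds, while encode_utf8(0xFFFE),
encode_utf8(0xFFFF) and encode_utf8(0xD800) each raise ValueError.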