[Issue 1357] Cannot use FFFF and FFFE in Unicode escape sequences.

d-bugmail at puremagic.com
Sun Sep 30 13:07:40 PDT 2007


http://d.puremagic.com/issues/show_bug.cgi?id=1357





------- Comment #5 from aziz.kerim at gmail.com  2007-09-30 15:07 -------
While writing my own encoding/decoding functions for Unicode characters, I found
out that certain Unicode code points are not allowed to be encoded as UTF-8
sequences. I'm quoting from here:
http://www.cl.cam.ac.uk/~mgk25/unicode.html

"Also note that the code positions U+D800 to U+DFFF (UTF-16 surrogates) as well
as U+FFFE and U+FFFF must not occur in normal UTF-8 or UCS-4 data. UTF-8
decoders should treat them like malformed or overlong sequences for safety
reasons."

So the behaviour of the compiler is actually correct, and Phobos has a bug.
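
To make the quoted rule concrete, here is a minimal sketch of an encoder that
refuses the code points listed above. It is written in Python only for
illustration; the names is_encodable and encode_utf8 are hypothetical, and this
is neither the compiler's check nor Phobos code.

```python
def is_encodable(cp: int) -> bool:
    """Return True if cp may be encoded under the rule quoted above."""
    if not 0 <= cp <= 0x10FFFF:      # outside the Unicode code space
        return False
    if 0xD800 <= cp <= 0xDFFF:       # UTF-16 surrogates
        return False
    if cp in (0xFFFE, 0xFFFF):       # the two code points this issue is about
        return False
    return True


def encode_utf8(cp: int) -> bytes:
    """Encode one code point as UTF-8, rejecting the forbidden values."""
    if not is_encodable(cp):
        raise ValueError(f"U+{cp:04X} must not be encoded as UTF-8")
    if cp < 0x80:                    # 1 byte:  0xxxxxxx
        return bytes([cp])
    if cp < 0x800:                   # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:                 # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

With this sketch, encode_utf8(0xFFFD) succeeds, while encode_utf8(0xFFFE),
encode_utf8(0xFFFF) and encode_utf8(0xD800) each raise ValueError. A decoder
following the same rule would apply is_encodable to every decoded value and
treat a failure like a malformed sequence.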

