If invalid string should crash (was: string need to be robust)
Kagamin
spam at here.lot
Mon Mar 14 03:14:34 PDT 2011
Jussi Jumppanen Wrote:
> %u Wrote:
>
> > I agree with a), but not b). I can't find anything in the Unicode standard
> > that says you can use the low surrogate like that.
>
> According to: http://www.cl.cam.ac.uk/~mgk25/
>
> According to ISO 10646-1:2000, sections D.7 and 2.3c, a device
> receiving UTF-8 shall interpret a "malformed sequence in the same way
> that it interprets a character that is outside the adopted subset" and
> "characters that are not within the adopted subset shall be indicated
> to the user" by a receiving device. A quite commonly used approach in
> UTF-8 decoders is to replace any malformed UTF-8 sequence by a
> replacement character (U+FFFD), which looks a bit like an inverted
> question mark, or a similar symbol.
>
> Refer to this file for the above quote:
>
> http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
Sounds like a text rendering guideline rather than a text processing guideline.
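
To make the distinction concrete, here is a minimal sketch in Python (not D, purely for illustration). The function names decode_for_display and decode_for_processing are hypothetical, and the sample bytes are an assumed example: a lone low surrogate (U+DC00) written in the UTF-8 bit pattern, which is ill-formed UTF-8. A renderer can substitute U+FFFD and carry on, while processing code may prefer to fail loudly.

```python
# Assumed sample input: "ab", then a lone low surrogate (U+DC00) encoded
# in the UTF-8 bit pattern (ill-formed UTF-8), then "cd".
data = b"ab\xed\xb0\x80cd"


def decode_for_display(raw: bytes) -> str:
    """Rendering-style decode: malformed bytes become U+FFFD."""
    return raw.decode("utf-8", errors="replace")


def decode_for_processing(raw: bytes) -> str:
    """Processing-style decode: malformed input raises UnicodeDecodeError."""
    return raw.decode("utf-8", errors="strict")


if __name__ == "__main__":
    # The valid parts survive; the bad bytes are shown as U+FFFD.
    print(repr(decode_for_display(data)))
    try:
        decode_for_processing(data)
    except UnicodeDecodeError as err:
        print("rejected:", err.reason)
```

Which of the two behaviours a string library should default to is exactly the question in this thread; the sketch only shows that both are available as separate policies.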