If invalid string should crash (was: string need to be robust)
ZY Zhou
rinick at GeeeeMail.com
Sun Mar 13 09:22:35 PDT 2011
Hi,
Invalid UTF-8 always breaks my program, so I suggest that when an invalid byte
in UTF-8 text needs to be converted to dchar, it is mapped to the low surrogate
code points (DC80~DCFF) instead of crashing the program. But many people here
don't like this idea; you think an exception is the right thing. OK, let me ask
you a question:
Do you always try/catch for invalid UTF when reading a file?
I believe you don't; you simply don't care.
Even though the text file is invalid, the use case itself is valid. Should a
browser crash on a web page that declares charset=utf8 but contains invalid
UTF-8? Exceptions don't help either; using them in this case is almost like
writing a UTF-8 decoder yourself.
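
For reference, this is roughly the kind of guard the question above is asking about. It is only a sketch: it assumes a current Phobos where std.utf.validate throws UTFException on malformed input, and the helper name isValidUtf8 is made up for illustration.

```d
import std.utf : validate, UTFException;

// Hypothetical helper (not part of Phobos): reports whether `text`
// is well-formed UTF-8 instead of letting the exception escape.
bool isValidUtf8(const(char)[] text)
{
    try
    {
        validate(text);   // throws UTFException on malformed input
        return true;
    }
    catch (UTFException e)
    {
        return false;
    }
}
```

A caller would have to run a check like this (or wrap every decode in try/catch) on each piece of text that comes from outside the program.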
Anyway, since I'm already using my own UTF decoder, I don't care whether you
agree with me or not.
But in the following case, it is completely wrong for it to crash at line 3:
1: char[] c = [0xA0];
2: string s = c.idup;
3: foreach(dchar d; s){}
The expected result is either:
a) crash at line 2, because c is not valid UTF and can't be converted to string
or:
b) don't crash, and d = 0xDCA0;
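
Option b) could look something like the sketch below. It is not the decoder mentioned in this post and not anything in Phobos; the name decodeLenient and the exact validity checks are assumptions made for illustration. Every byte that cannot start a well-formed UTF-8 sequence is emitted as 0xDC00 + byte, which lands in DC80~DCFF because such bytes are always >= 0x80.

```d
// Hypothetical lenient decoder: never throws, maps each undecodable
// byte b to the low surrogate code point 0xDC00 + b.
dchar[] decodeLenient(const(ubyte)[] bytes)
{
    // Smallest code point allowed for each sequence length (rejects overlongs).
    static immutable uint[5] minValue = [0, 0, 0x80, 0x800, 0x10000];

    dchar[] result;
    size_t i = 0;
    while (i < bytes.length)
    {
        immutable ubyte b = bytes[i];
        size_t len = 0;   // expected sequence length, 0 = invalid lead byte
        uint cp = 0;      // code point being assembled

        if (b < 0x80)                { len = 1; cp = b; }
        else if ((b & 0xE0) == 0xC0) { len = 2; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; }

        // The sequence must fit in the input and use only continuation bytes.
        bool ok = len != 0 && i + len <= bytes.length;
        for (size_t k = 1; ok && k < len; ++k)
        {
            if ((bytes[i + k] & 0xC0) != 0x80)
                ok = false;
            else
                cp = (cp << 6) | (bytes[i + k] & 0x3F);
        }

        // Reject overlong forms, encoded surrogates and values past U+10FFFF.
        if (ok && (cp < minValue[len] || cp > 0x10FFFF
                   || (cp >= 0xD800 && cp <= 0xDFFF)))
            ok = false;

        if (ok)
        {
            result ~= cast(dchar) cp;
            i += len;
        }
        else
        {
            result ~= cast(dchar)(0xDC00 + b);   // escape the single bad byte
            i += 1;
        }
    }
    return result;
}

unittest
{
    // The byte from the example above decodes to 0xDCA0 instead of throwing.
    assert(decodeLenient([0xA0]) == [cast(dchar) 0xDCA0]);
}
```

Because the escaped values are surrogates, they can't collide with any code point that valid UTF-8 produces, so the original bytes can be recovered later if needed.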
--ZY Zhou