string need to be robust

Jonathan M Davis jmdavisProg at gmx.com
Sun Mar 13 23:27:44 PDT 2011


On Sunday 13 March 2011 22:45:38 ZY Zhou wrote:
> it doesn't make sense to add try/catch every time you use
> tolower/toupper/foreach on string. No one will do that.
> You either throw exception when convert invalid utf8 bytes to string, or
> never throw exception and use invalid UTF32 code in dchar to represent
> invalid utf8 code.
> 
>   string s = "\x0A"; // this is the right place to throw the exception
>                      // (or compile error)
>   s.tolower; // no one will add try/catch on this

If you're going to worry about string validity, check it when the string is
first created. If it's not valid, fix it in whatever way you deem appropriate.
After that, you shouldn't have to worry about validity anymore. Honestly,
invalid UTF-8 is pretty rare overall. You'll get it from a bad file or some
other external input, but once a string is valid, it stays valid. So, you
really only have to worry about it at the points where you read in files and
the like. Once the data has been read in correctly, you just use the strings
without worrying about it. There shouldn't be much need for try-catch blocks
anywhere else to deal with invalid Unicode.
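
As a rough sketch of what I mean by checking at the boundary: the helper
toValidString below is hypothetical (it is not a Phobos function), and the byte
arrays stand in for data read from a file. It assumes std.utf.validate, which
throws if the string is not well-formed UTF-8. The sketch catches plain
Exception so that it doesn't depend on the exact name of the exception class.

```d
import std.stdio : writeln;
import std.utf : validate;

// Hypothetical helper (not part of Phobos): turn raw bytes into a string,
// validating exactly once, at the point where the data enters the program.
string toValidString(const(ubyte)[] raw)
{
    auto text = cast(string) raw.idup; // treat an immutable copy as UTF-8
    validate(text);                    // throws if text is not well-formed
    return text;
}

void main()
{
    ubyte[] good = [0x68, 0x69];       // the bytes of "hi"
    ubyte[] bad  = [0x68, 0xFF];       // 0xFF never appears in valid UTF-8

    string s = toValidString(good);

    // From here on the string is known to be valid, so ordinary code
    // that decodes it needs no try-catch.
    size_t count = 0;
    foreach (dchar c; s)
        ++count;
    writeln(s, " has ", count, " code points");

    // The one place that has to care about bad input is the boundary.
    try
    {
        toValidString(bad);
    }
    catch (Exception e)                // std.utf's exception derives from this
    {
        writeln("rejected invalid input at the boundary");
    }
}
```

For files specifically, std.file.readText should already do the read plus
validation in one call, so a hand-written helper like this is only needed for
data arriving some other way.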

- Jonathan M Davis


More information about the Digitalmars-d mailing list