string need to be robust
Michel Fortin
michel.fortin at michelf.com
Sun Mar 13 07:57:07 PDT 2011
On 2011-03-13 10:18:24 -0400, ZY Zhou <rinick at GeeeMail.com> said:
> What if I'm making a text editor with D?
> I know the text has something wrong, I want to open it and fix it. The
> exception won't help; if the editor just refuses to open an invalid
> file, then the editor is useless.
> Try opening an invalid UTF file with a text editor, like vim, and you
> will understand what I mean.
But what is the best thing to do when you get an invalid UTF file in a
text editor? Perhaps you should show a warning to the user, perhaps you
should also ask the user to select the right text encoding (because it
might simply not be UTF-8), or perhaps you want to silently ignore the
error and show an invalid character marker at the right point in the
text. All of these options are valid, and the programming language
shouldn't decide that for you.
So I'd point out that a text file editor is a special use case: most
programs aren't text file editors and don't share this concern. In the
same vein, HTML parsers are also a special case that should know how to
handle encodings. In fact, HTML 5 explicitly defines how to deal with
invalid UTF-8 sequences:
<http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#utf-8>
There are many good ways to deal with invalid UTF-8 sequences. Throwing
an exception seems like the most robust one to me since it protects
against invalid input. What to do with invalid input belongs in the
application logic, not the language.
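
To make that split concrete, here is a minimal sketch, written in Python
rather than D purely for illustration. The function names
(decode_strict, open_in_editor) and the fallback_encoding parameter are
hypothetical, not part of any D or Phobos API. The low-level decoder
throws on invalid input, and the editor-like caller chooses its own
policy: a user-selected encoding if one is given, otherwise replacement
markers.

```python
from typing import Optional, Tuple


def decode_strict(data: bytes) -> str:
    """Library-level behaviour: refuse invalid UTF-8 by raising."""
    # bytes.decode defaults to errors="strict", which raises
    # UnicodeDecodeError on the first invalid sequence.
    return data.decode("utf-8")


def open_in_editor(data: bytes,
                   fallback_encoding: Optional[str] = None) -> Tuple[str, str]:
    """Application-level policy (hypothetical editor logic).

    Returns the decoded text and a label describing how it was decoded.
    """
    try:
        return decode_strict(data), "utf-8"
    except UnicodeDecodeError:
        if fallback_encoding is not None:
            # Policy 1: the user told us the file is in another encoding.
            return data.decode(fallback_encoding), fallback_encoding
        # Policy 2: keep going and mark each bad sequence with U+FFFD.
        return data.decode("utf-8", errors="replace"), "utf-8 (lossy)"


if __name__ == "__main__":
    raw = b"a\xffb"  # 0xFF can never appear in well-formed UTF-8
    print(open_in_editor(raw))
    print(open_in_editor(raw, fallback_encoding="latin-1"))
```

A program that is not a text editor would simply call decode_strict and
let the exception propagate, which is the default argued for above.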
--
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/