[Issue 14519] [Enh] foreach on strings should return replacementDchar rather than throwing

via Digitalmars-d-bugs digitalmars-d-bugs at puremagic.com
Wed Apr 29 00:36:51 PDT 2015


https://issues.dlang.org/show_bug.cgi?id=14519

--- Comment #6 from Jonathan M Davis <issues.dlang at jmdavisProg.com> ---
(In reply to Vladimir Panteleev from comment #4)
> Here's a counter-proposal: when encountering invalid UTF-8, instead of
> throwing exceptions, throw errors. This will fix the nothrow and performance
> problems, and will avoid the risk of data corruption.


Yikes. That is far worse than throwing Exceptions, since it would kill your
program, and an Error is supposed to indicate a bug in the program rather than
bad input.

> The workaround is to
> pre-sanitize the input. The impact of breaking existing code is the same as
> the original proposal.

Pre-sanitizing input is exactly what should be done if you care about Unicode
validation. You validate any strings entering the program from a file, a
socket, or user input, and from then on you know that you're operating on valid
Unicode. But most programs just don't care whether their Unicode is valid, and
the fact that invalid input is handled by throwing is incredibly annoying. It
forces validation on all programs whether they need it or not, and it means
that string-based code can pretty much never be nothrow. Standing in for
invalid Unicode is exactly what the replacement character was created for in
the first place.
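
To make the two approaches concrete, here is a rough sketch, assuming a Phobos
recent enough to have std.utf.byDchar, which as far as I know substitutes
replacementDchar for invalid sequences rather than throwing. The helper names
isValidUtf8 and decodeLossy are made up for illustration and are not part of
Phobos or of this proposal.

```d
import std.utf : byDchar, replacementDchar, validate, UTFException;

/// Hypothetical helper: strict check, done once where data enters the program.
bool isValidUtf8(string s)
{
    try
    {
        validate(s); // throws UTFException on malformed input
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

/// Hypothetical helper: lenient decoding. byDchar yields replacementDchar
/// (U+FFFD) for invalid sequences instead of throwing.
dstring decodeLossy(string s)
{
    dstring result;
    foreach (dchar c; s.byDchar)
        result ~= c;
    return result;
}

void main()
{
    // 'a', then 0xFF (a byte that never occurs in valid UTF-8), then 'b'.
    immutable(ubyte)[] raw = [0x61, 0xFF, 0x62];
    string s = cast(string) raw;

    // A program that cares about validity rejects the input at the boundary.
    assert(!isValidUtf8(s));
    assert(isValidUtf8("abc"));

    // A program that does not care just sees U+FFFD where the bad byte was.
    auto decoded = decodeLossy(s);
    assert(decoded.length == 3);
    assert(decoded[1] == replacementDchar);
}
```

Code that needs strictness pays for validate once at the boundary, and
everything downstream can decode leniently without a try/catch around every
loop over a string.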
