dmd foreach loops throw exceptions on invalid UTF sequences, use replacementDchar instead
FeepingCreature
feepingcreature at gmail.com
Fri Nov 5 06:30:02 UTC 2021
On Friday, 5 November 2021 at 00:38:59 UTC, Walter Bright wrote:
> On 11/3/2021 10:41 PM, FeepingCreature wrote:
>> On Thursday, 4 November 2021 at 05:34:29 UTC, FeepingCreature
>> wrote:
>>> One may disagree about autodecoding; I for one think it's a
>>> sensible idea. However, a program should either process data
>>> correctly or, if that is impossible, not at all. It should
>>> not, ever, silently modify it "for you" while reading! I
>>> predict this will lead to cryptic, hair-pulling bugs in user
>>> code involving replacement characters appearing far
>>> downstream of the error site.
>
> Surprisingly, the reverse seems to be true. Suppose you're
> writing a text editor. Then read a file with some bad UTF in
> it. The editor dies with an exception. You can't even edit the
> file to fix it.
>
> If you need to display user provided text, like in a browser,
> or all sorts of tools, you don't want to die with an exception.
> What are you going to do in an exception handler? You're just
> going to replace the offending bytes with ReplacementChar and
> go render it anyway.
>
>> (This is floating point NaN all over again!)
>
> Poor NaNs are terribly misunderstood.
>
> Suppose you have an array of sensors. One goes bad. The "bad"
> value is 0.0. So now your data analyzer is happily averaging
> 0.0 into the results, silently skewing them.
>
> Now, if a NaN is returned instead, your "average" will be NaN.
> You know it's no good. It won't be hidden.
>
> Uninitialized variables are sensors giving bad data. Having a
> NaN in your result is a *good* thing.

I think the program should crash in all these cases. The text
editor should crash. The browser should crash. The analyzer
should see a NaN, and crash.
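
To make the analyzer case concrete, here is a minimal D sketch of what "see a NaN, and crash" could look like. `checkedAverage` is a hypothetical helper made up for illustration, not anything from Phobos, and it assumes that throwing an exception nobody catches counts as the crash:

```d
import std.algorithm : sum;
import std.exception : enforce;
import std.math : isNaN;
import std.stdio : writeln;

// Hypothetical helper: average the readings, but refuse to return
// a result that a NaN reading has poisoned.
double checkedAverage(const(double)[] readings)
{
    enforce(readings.length > 0, "no readings to average");
    immutable avg = readings.sum / readings.length;
    // Any NaN input propagates into the sum, so one check here is enough.
    enforce(!avg.isNaN, "a sensor reported NaN");
    return avg;
}

void main()
{
    writeln(checkedAverage([1.0, 2.0, 3.0]));

    // One sensor went bad. The exception is deliberately not caught,
    // so the program terminates here instead of reporting a number.
    writeln(checkedAverage([1.0, double.nan, 3.0]));
}
```

The NaN still does the job of carrying the error to the analyzer; the difference is that the analyzer stops there rather than passing it on.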

These programs are *wrong.* They thought they could only get
Unicode and they've gotten non-Unicode. So we know they're
written on wrong assumptions; why do we want to continue running
code we know is untrustworthy? Let them crash, let them be fixed
to make fewer assumptions. Automagically handling errors by
propagating them in an inert form robs the developers and users
of a chance to avoid a mistake. It's no better than 0.0.
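
For reference, the two behaviors being argued about can be compared directly with Phobos, independent of what a bare foreach does in any given compiler version. This is only a sketch: `isStrictlyValid` and `countReplacements` are hypothetical names, and the input is a made-up three-byte buffer containing a lone 0xFF, which is never valid in UTF-8.

```d
import std.stdio : writeln;
import std.utf : byDchar, replacementDchar, UTFException, validate;

// Strict policy: report whether the text is well-formed UTF-8.
// std.utf.validate throws UTFException on the first bad sequence.
bool isStrictlyValid(string text)
{
    try
    {
        validate(text);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

// Lenient policy: std.utf.byDchar decodes without throwing and
// yields replacementDchar (U+FFFD) in place of invalid sequences.
size_t countReplacements(string text)
{
    size_t replaced;
    foreach (dchar c; text.byDchar)
        if (c == replacementDchar)
            ++replaced;
    return replaced;
}

void main()
{
    // 'a', a stray 0xFF byte, 'b' (assumed sample input)
    immutable(ubyte)[] raw = [0x61, 0xFF, 0x62];
    string text = cast(string) raw;

    writeln("strictly valid: ", isStrictlyValid(text));
    writeln("replacement characters produced: ", countReplacements(text));
}
```

With the strict path the caller learns about the bad byte at the point of reading; with the lenient path the only trace is a U+FFFD somewhere in the decoded output.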
More information about the Digitalmars-d mailing list