dmd foreach loops throw exceptions on invalid UTF sequences, use replacementDchar instead

Walter Bright newshound2 at digitalmars.com
Fri Nov 5 00:38:59 UTC 2021


On 11/3/2021 10:41 PM, FeepingCreature wrote:
> On Thursday, 4 November 2021 at 05:34:29 UTC, FeepingCreature wrote:
>> One may disagree about autodecoding; I for one think it's a sensible idea. 
>> However, a program should either process data correctly or, if that is 
>> impossible, not at all. It should not, ever, silently modify it "for you" 
>> while reading! I predict this will lead to cryptic, hair-pulling bugs in user 
>> code involving replacement characters appearing far downstream of the error site.

Surprisingly, the reverse seems to be true. Suppose you're writing a text 
editor, and it reads a file with some bad UTF in it. The editor dies with an 
exception. You can't even edit the file to fix it.

If you need to display user-provided text, like in a browser, or all sorts of 
tools, you don't want to die with an exception. What are you going to do in an 
exception handler? You're just going to replace the offending bytes with 
replacementDchar and go render it anyway.
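
To make the trade-off concrete, here is a minimal sketch of the two decoding
policies. It is written in Python rather than D purely for illustration, and
the helper names load_strict and load_lenient are hypothetical, not part of
any library. It only shows the general idea of throwing versus substituting
U+FFFD, not how dmd implements foreach.

```python
# Illustrative sketch only: Python stands in for D here, and the helper
# names below are made up for this example.

def load_strict(raw: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on the first invalid sequence."""
    return raw.decode("utf-8", errors="strict")


def load_lenient(raw: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for each invalid sequence."""
    return raw.decode("utf-8", errors="replace")


# The byte 0xFF can never appear in well-formed UTF-8.
damaged = b"hello \xff world"

# Strict policy: the whole load fails, so an editor has nothing to show.
try:
    load_strict(damaged)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Lenient policy: the text loads and the damage is visible as U+FFFD.
text = load_lenient(damaged)
```

With the lenient policy the rest of the file stays readable and the bad spot
is marked in place, which is what an exception handler would most likely end
up doing by hand anyway.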

> (This is floating point NaN all over again!)

Poor NaNs are terribly misunderstood.

Suppose you have an array of sensors. One goes bad, and the value it reports 
when "bad" is 0.0. So now your data analyzer is happily averaging 0.0 into the 
results, silently skewing them.

Now, if a NaN is returned instead, your "average" will be NaN. You know it's no 
good. It won't be hidden.
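
The sensor example can be sketched in a few lines. This is an illustration
in Python, not D, and the readings and the mean helper are invented for the
example. It relies only on standard IEEE 754 behavior: any sum that includes
a NaN is itself NaN.

```python
import math


def mean(readings):
    """Plain arithmetic mean; a NaN anywhere in the input yields NaN."""
    return sum(readings) / len(readings)


# Hypothetical sensor readings; the middle sensor has failed.
good = [20.0, 21.0, 22.0]
with_zero = [20.0, 0.0, 22.0]          # failure reported as 0.0
with_nan = [20.0, float("nan"), 22.0]  # failure reported as NaN

skewed = mean(with_zero)   # a plausible-looking but wrong number
poisoned = mean(with_nan)  # NaN, so the failure cannot be overlooked

# NaN compares unequal to everything, including itself, so test with isnan.
failure_detected = math.isnan(poisoned)
```

The 0.0 case produces an average of 14.0 instead of 21.0 with no hint that
anything went wrong, while the NaN case fails the isnan check immediately.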

Uninitialized variables are sensors giving bad data. Having a NaN in your result 
is a *good* thing.


