DIP76: Autodecode Should Not Throw

Vladimir Panteleev via Digitalmars-d digitalmars-d at puremagic.com
Tue Apr 7 02:10:32 PDT 2015


On Tuesday, 7 April 2015 at 09:04:09 UTC, Walter Bright wrote:
> On 4/7/2015 1:19 AM, Dicebot wrote:
>> I have doubts about it similar to Vladimir. Main problem is that I
>> have no idea what actually happens if replacement characters appear
>> in some unicode text my program processes.
>
> It's much like floating point NaN values, which are 'sticky'.

Yes, but std.conv doesn't return NaN if you try to convert 
"banana" to a double; it throws a ConvException.
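
As a rough illustration of that contrast, here is a minimal sketch in Python rather than D, used only as an analogy: Python's float() stands in for std.conv, and the parse_or_none helper is hypothetical. NaN propagates silently through arithmetic, while parsing non-numeric text raises an error instead of producing NaN.

```python
import math

# NaN is "sticky": once it enters a computation, every result stays NaN.
nan = float("nan")
sticky_result = (nan + 1.0) * 2.0
still_nan = math.isnan(sticky_result)


def parse_or_none(text):
    """Hypothetical helper: return the parsed float, or None if parsing fails.

    float() raises ValueError on non-numeric text; it never returns NaN
    for input such as "banana".
    """
    try:
        return float(text)
    except ValueError:
        return None


banana = parse_or_none("banana")
number = parse_or_none("1.5")
```

The conversion reports bad input at the point where it occurs, whereas the NaN only shows up later in whatever result it has contaminated.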

> With UTF strings, if you care about invalid UTF (a surprisingly 
> large amount of operations done on strings simply don't care 
> about invalid UTF) the validation can be done as a separate 
> step.

So can converting invalid UTF to replacement characters.

>> Also it is worrying to see so much effort put into `nothrow` in
>> language which endorses exceptions as its main error reporting
>> mechanism.
>
> There is definitely a tug of war going on there. Exceptions are 
> great, except they aren't free.
>
> What I've tried to do is design things so that erroneous input 
> is not possible - that all possible input has straightforward 
> output. In other words, try to define the problem out of 
> existence. Then there are no errors.

I think the correct solution to that is to kill auto-decoding :) 
Then all decoding is explicit, and since it is explicit, it is 
trivial to allow specifying the desired behavior upon 
encountering invalid UTF-8.
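
A minimal sketch of what "explicit decoding with a caller-chosen policy" could look like, written in Python rather than D and only as an analogy. The decode_utf8 function and its on_invalid parameter are hypothetical names; the only real API used is Python's bytes.decode with its "strict" and "replace" error handlers.

```python
def decode_utf8(data: bytes, on_invalid: str = "strict") -> str:
    """Hypothetical helper: decode UTF-8 with an explicit invalid-input policy.

    on_invalid="strict"  raises UnicodeDecodeError on invalid sequences.
    on_invalid="replace" substitutes U+FFFD for each invalid sequence.
    """
    if on_invalid not in ("strict", "replace"):
        raise ValueError("on_invalid must be 'strict' or 'replace'")
    return data.decode("utf-8", errors=on_invalid)


# 'a', a lone 0xFF byte (never valid in UTF-8), then 'b'.
raw = b"a\xffb"

# Lenient policy: the bad byte becomes the replacement character.
lenient = decode_utf8(raw, on_invalid="replace")

# Strict policy: the same input is reported as an error.
try:
    decode_utf8(raw, on_invalid="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

Because the call site names the policy, neither throwing nor replacing has to be the one behavior baked into every string operation.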

