DIP76: Autodecode Should Not Throw

Walter Bright via Digitalmars-d digitalmars-d at puremagic.com
Tue Apr 7 02:21:50 PDT 2015


On 4/7/2015 2:10 AM, Vladimir Panteleev wrote:
> On Tuesday, 7 April 2015 at 09:04:09 UTC, Walter Bright wrote:
>> On 4/7/2015 1:19 AM, Dicebot wrote:
>>> I have doubts about it similar to Vladimir's. The main problem is that I have
>>> no idea what actually happens if replacement characters appear in some Unicode
>>> text my program processes.
>>
>> It's much like floating point NaN values, which are 'sticky'.
>
> Yes, but std.conv doesn't return NaN if you try to convert "banana" to a double.

Maybe it should :-)
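
To make the NaN analogy concrete, here is a minimal sketch in Python (not D, and not std.conv). The `mean` helper and the sample readings are hypothetical. It shows a NaN passing through ordinary arithmetic and being checked once at the end, while parsing non-numeric text raises immediately.

```python
import math

def mean(values):
    """Plain arithmetic mean with no special handling for NaN (hypothetical helper)."""
    return sum(values) / len(values)

# One bad reading among otherwise valid data.
readings = [1.5, float("nan"), 2.5]

# NaN is "sticky": it propagates through the sum and the division
# without interrupting the computation.
result = mean(readings)

# The caller decides where to check, once, after the work is done.
is_bad = math.isnan(result)

# By contrast, converting text that is not a number raises right away.
try:
    float("banana")
    parse_raised = False
except ValueError:
    parse_raised = True
```

Whether a conversion routine should return NaN or raise is exactly the design question being debated here; the sketch only shows the two behaviours side by side.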


>> With UTF strings, if you care about invalid UTF (a surprisingly large number
>> of operations done on strings simply don't care about invalid UTF) the
>> validation can be done as a separate step.
>
> So can converting invalid UTF to replacement characters.

I know, I read your post. The machinery to allocate, throw, catch, and replace 
is still there.


> I think the correct solution to that is to kill auto-decoding :) Then all
> decoding is explicit, and since it is explicit, it is trivial to allow
> specifying the desired behavior upon encountering invalid UTF-8.

I agree autodecoding is a mistake, but we're stuck with it.
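
The two behaviours discussed in this thread (throwing on invalid UTF-8 versus substituting U+FFFD) and the idea of validating as a separate step can be sketched as follows. This is an illustration in Python using its built-in codec error handlers, not Phobos code. The sample bytes and the `is_valid_utf8` helper are assumptions made for the example.

```python
# "café" in UTF-8, then a lone 0xFF byte (never valid in UTF-8), then " ok".
data = b"caf\xc3\xa9 \xff ok"

def is_valid_utf8(raw: bytes) -> bool:
    """Separate validation step (hypothetical helper): True if raw is well-formed UTF-8."""
    try:
        raw.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False

# Behaviour 1: decoding throws on the first invalid sequence.
try:
    data.decode("utf-8", errors="strict")
    strict_raised = False
except UnicodeDecodeError:
    strict_raised = True

# Behaviour 2: decoding never throws; the invalid byte becomes U+FFFD.
text = data.decode("utf-8", errors="replace")

# Like NaN, the replacement character survives later string operations,
# so code that does not care about validity keeps working.
shouted = text.upper()
still_marked = "\ufffd" in shouted

# Code that does care checks explicitly, before or after processing.
valid = is_valid_utf8(data)
```

With explicit decoding the caller picks the error policy at the call site, which is the point made in the quoted text above. With implicit decoding one policy has to be chosen for everyone.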

