Handling invalid UTF sequences

Sat Mar 22 14:43:32 PDT 2014

On Friday, 21 March 2014 at 10:39:49 UTC, Denis Shelomovskij 
wrote:
> 21.03.2014 12:25, monarch_dodra wrote:
>> If I remember correctly, with a specially written UTF string, it *was*
>> possible to corrupt program state. I think. I need to double check. I
>> didn't give it much thought then ("it should virtually never happen"),
>> but it could be used as deliberate security vulnerability.
>
> Almost nothing to add here. We already have `-noboundscheck` 
> which can dramatically increase performance, throwing 
> `UTFError` should either use same mechanics (`-noutfcheck`?) or 
> just be stripped in release. Personally I'd choose the latter 
> as there are lots of (sometimes very slow) assertions stripped 
> with `-release` in real programs, which indicates same critical 
> data corruption.

Except it's a Unicode *Exception*. Invalid Unicode is *NOT* 
supposed to be an error.

Now I remember: truncated Unicode strings can cause out-of-bounds 
slicing in popFront.
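
To make the failure mode concrete, here is a minimal, language-neutral sketch in C. It is not the Phobos implementation, and `utf8_stride` and `pop_front_checked` are hypothetical names made up for illustration. The point it shows: the lead byte of a UTF-8 sequence announces how many bytes follow, and if the string was truncated, advancing by that announced length without comparing it to the remaining length steps past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Sequence length implied by a UTF-8 lead byte,
   or 0 if the byte cannot start a sequence. */
static size_t utf8_stride(unsigned char lead)
{
    if (lead < 0x80) return 1;            /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 0;                             /* continuation or invalid lead byte */
}

/* Hypothetical checked pop: advance *s past one code point.
   Returns 0 on success; returns -1 and leaves *s and *len untouched
   if the input is empty, starts with an invalid lead byte, or is
   truncated. Continuation bytes are NOT validated here; only the
   length is checked. */
static int pop_front_checked(const unsigned char **s, size_t *len)
{
    assert(s != NULL && *s != NULL && len != NULL);
    if (*len == 0) return -1;

    size_t n = utf8_stride((*s)[0]);
    /* Without the "n > *len" test, "*len -= n" would wrap around
       and *s would point past the end of the buffer. */
    if (n == 0 || n > *len) return -1;

    *s += n;
    *len -= n;
    return 0;
}
```

For the two-byte input `E2 82` (a three-byte sequence with its last byte cut off), `pop_front_checked` reports failure instead of advancing three bytes into memory it does not own. Whether that failure should surface as an exception or an error is exactly the policy question in this thread.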

This means we are currently operating on an inconsistent standard: 
sometimes an exception, sometimes an error, sometimes corruption.