Handling invalid UTF sequences

Thu Mar 20 16:27:22 PDT 2014

On 3/20/2014 3:51 PM, monarch_dodra wrote:
> In any case, both proposals would be major breaking changes...

Or we could do this as alternate names, leaving the originals as throwing.

> Silently accepting invalid sequences sounds nice at first, but it's kind of
> just squelching the problem, isn't it?

Not exactly. The decoded/encoded string will still have invalid code units in 
it. It'd be like floating point NaN: the invalid bits will still be propagated 
onwards to the output.
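
To illustrate the NaN analogy, here is a rough sketch. It uses Python's built-in 
codec error handlers purely as a stand-in; it is not the std.utf API, and the 
function names are made up for the example. Here the "NaN" is the replacement 
character U+FFFD, which is one possible choice of marker:

```python
# Illustration only: Python's codec error handlers, not D's std.utf.
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER, playing the role of NaN


def decode_lenient(raw: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for invalid bytes instead of raising."""
    return raw.decode("utf-8", errors="replace")


def decode_throwing(raw: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on the first invalid byte."""
    return raw.decode("utf-8", errors="strict")


def shout(text: str) -> str:
    """Some later operation that knows nothing about the bad input."""
    return text.upper()


bad = b"ab\xffcd"  # 0xFF can never appear in well-formed UTF-8
out = shout(decode_lenient(bad))
# The marker survives the later operation and is visible in the output.
assert REPLACEMENT in out
```

The point is that the problem is not hidden: anyone looking at the output can 
still see that something invalid went in.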

I'm also of the belief that UTF sequences should be validated on input, not 
necessarily on every operation on them.
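
As a sketch of what validating on input could look like (again in Python for 
illustration, with hypothetical function names): check once at the boundary, 
then let everything downstream assume the text is well formed.

```python
# Illustration only: validate once where data enters the program.
def read_validated(raw: bytes) -> str:
    """Input boundary: raises UnicodeDecodeError if raw is not valid UTF-8."""
    return raw.decode("utf-8", errors="strict")


def count_words(text: str) -> int:
    """Downstream operation: does no validation of its own."""
    return len(text.split())


text = read_validated("héllo wörld".encode("utf-8"))
words = count_words(text)
```

With this split, only read_validated can fail on bad encoding, so the 
operations after it do not each need to pay for a check.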