[Issue 12923] UTF exception in stride even though passes validate.

via Digitalmars-d-bugs digitalmars-d-bugs at puremagic.com
Sat Jun 14 14:57:02 PDT 2014


https://issues.dlang.org/show_bug.cgi?id=12923

--- Comment #2 from Timothee Cour <timothee.cour2 at gmail.com> ---
(In reply to Timothee Cour from comment #1)
> (In reply to Timothee Cour from comment #0)
> > import std.utf;
> > void main(){
> >   char[3]a=[167, 133, 175];
> >   validate(a);
> >   //passes
> > 
> >   auto k=stride(a,0);
> >   /+
> >   std.utf.UTFException at std/utf.d(199): Invalid UTF-8 sequence (at index 0)
> >   pure @safe uint std.utf.stride!(char[3]).stride(ref char[3], ulong) + 141
> >   +/
> > }
> > 
> > This happens even after applying the fix
> > https://github.com/D-Programming-Language/phobos/pull/2038
> 
> Additionally, another error is thrown on any of those:
> foreach (i, dchar c; a){} //src/rt/util/utf.d:290 Invalid UTF-8 sequence
> foreach_reverse (i, dchar c; a){} //src/rt/aApplyR.d:511 Invalid UTF-8 sequence
> 
> so perhaps std.utf.validate accepts some invalid UTF sequences


Here's one possible fix, in decodeImpl:
----
UTFException invalidUTF(){...}

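// insert this
import core.bitop;
// msbs = number of leading 1 bits in the first code unit, fst
immutable msbs = 7 - bsr(~fst);
// msbs == 1 means fst is a continuation byte (10xxxxxx), which cannot start a sequence
if (msbs < 2 || msbs > 6) throw invalidUTF();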

UTFException outOfBounds() {...}
----

This would give decodeImpl the same behavior as strideImpl.
But is that correct, or was the behavior in strideImpl wrong to begin with?
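
For reference, here is a minimal sketch of what that check computes for the three bytes from comment #0. The helper names leadingOnes and isPlausibleLeadByte are hypothetical (they are not Phobos functions), and the explicit widening and mask are my assumption, added so the result does not depend on how `~` promotes a narrow operand:

```d
import core.bitop : bsr;
import std.stdio : writefln;

// Hypothetical helper, not part of Phobos: number of leading 1 bits in b.
// Requires b != 0xFF, because bsr(0) is undefined.
uint leadingOnes(ubyte b)
{
    // Widen and mask explicitly so only the low 8 bits are inspected.
    return 7 - bsr((~cast(uint) b) & 0xFF);
}

// Hypothetical helper: could b start a UTF-8 sequence under the
// "msbs < 2 || msbs > 6" check quoted above?
bool isPlausibleLeadByte(ubyte b)
{
    if (b < 0x80) return true;     // ASCII, a single-byte sequence
    if (b == 0xFF) return false;   // no zero bit to find
    immutable msbs = leadingOnes(b);
    return msbs >= 2 && msbs <= 6; // same bounds as the quoted check
}

void main()
{
    immutable ubyte[3] bytes = [167, 133, 175]; // the array from comment #0
    foreach (b; bytes)
        writefln("%3d = 0b%08b  leading ones: %d  lead byte: %s",
                 b, b, leadingOnes(b), isPlausibleLeadByte(b));
}
```

All three bytes have the form 10xxxxxx, so each has exactly one leading 1 bit and the check treats none of them as the start of a sequence. That matches stride throwing at index 0.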
