Need to do some "dirty" UTF-8 handling

Nick Sabalausky a at a.a
Sat Jun 25 14:49:13 PDT 2011


"Andrej Mitrovic" <andrej.mitrovich at gmail.com> wrote in message 
news:mailman.1215.1309019944.14074.digitalmars-d-learn at puremagic.com...
> I've had a similar requirement some time ago. I've had to copy and
> modify the phobos function std.utf.decode for a custom text editor
> because the function throws when it finds an invalid code point. This
> is way too slow for my needs. I'm actually displaying invalid code
> points with special marks (just like Scintilla), so I need decoding to
> work as fast as possible.
>
> The new function simply replaces throwing exceptions with flagging a boolean.

I think I may end up doing something like that :/

I was hoping to be able to do something vaguely sensible like this:

string newStr;
foreach(dchar dc; str)
{
    if(isValidDchar(dc))
        newStr ~= dc;
    else
        newStr ~= 'X';
}
str = newStr;

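But that just blows up in my face: the foreach decodes str as it goes and 
throws on the first invalid sequence, before isValidDchar ever gets a look 
at it.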
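
For what it's worth, here is a rough sketch of the decode-by-hand route. It 
assumes a current Phobos, where std.utf.decode throws UTFException on 
malformed input (older releases may spell the exception differently). The 
name replaceInvalid is made up for illustration, and catching an exception 
per bad byte is exactly the slow path Andrej wanted to avoid, so treat it as 
a starting point rather than the fast version:

```d
import std.utf : decode, UTFException;

// Hypothetical helper: copies str, substituting 'X' for every byte that
// does not start a valid UTF-8 sequence.
string replaceInvalid(string str)
{
    string result;
    size_t i = 0;
    while (i < str.length)
    {
        size_t next = i;
        try
        {
            // decode advances next past one complete code point
            dchar dc = decode(str, next);
            result ~= dc;
        }
        catch (UTFException)
        {
            // Malformed input: emit a marker and resync one byte later
            result ~= 'X';
            next = i + 1;
        }
        i = next;
    }
    return result;
}
```

With that, the loop above would collapse to str = replaceInvalid(str);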




