[Issue 14519] Get rid of unicode validation in string processing
via Digitalmars-d-bugs
digitalmars-d-bugs at puremagic.com
Fri May 20 07:20:03 PDT 2016
https://issues.dlang.org/show_bug.cgi?id=14519
--- Comment #38 from Martin Nowak <code at dawg.eu> ---
(In reply to Vladimir Panteleev from comment #36)
> Question, is there any overhead in actually verifying the validity of UTF-8
> streams, or is all overhead related to error handling (i.e. inability to be
> nothrow)?
I think it's fairly measurable b/c you need to add lots of additional checks
and branches (though highly predictable ones).
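
To make the "additional checks and branches" concrete, here is a rough sketch in C (not the Phobos code, and not a benchmark). The names decode_unchecked, decode_checked and DECODE_ERROR are made up for illustration. The first function assumes the input is already valid UTF-8; the second adds the checks a validating decoder typically needs: bounds, stray or missing continuation bytes, overlong encodings, surrogates, and the U+10FFFF limit.

```c
#include <stddef.h>
#include <stdint.h>

#define DECODE_ERROR 0xFFFFFFFFu /* hypothetical sentinel, not a Phobos name */

/* Assumes s holds valid UTF-8: only the lead-byte dispatch, no validation. */
static uint32_t decode_unchecked(const unsigned char *s, size_t *i)
{
    uint32_t c = s[(*i)++];
    if (c < 0x80)
        return c;
    if (c < 0xE0) {
        c = (c & 0x1F) << 6 | (uint32_t)(s[*i] & 0x3F);
        *i += 1;
        return c;
    }
    if (c < 0xF0) {
        c = (c & 0x0F) << 12 | (uint32_t)(s[*i] & 0x3F) << 6
          | (uint32_t)(s[*i + 1] & 0x3F);
        *i += 2;
        return c;
    }
    c = (c & 0x07) << 18 | (uint32_t)(s[*i] & 0x3F) << 12
      | (uint32_t)(s[*i + 1] & 0x3F) << 6 | (uint32_t)(s[*i + 2] & 0x3F);
    *i += 3;
    return c;
}

/* Same decoding, plus validation. Returns DECODE_ERROR and leaves *i
   untouched when the sequence starting at *i is not valid UTF-8. */
static uint32_t decode_checked(const unsigned char *s, size_t len, size_t *i)
{
    size_t p = *i;
    size_t n;     /* number of continuation bytes expected */
    uint32_t min; /* smallest code point allowed for this length */
    uint32_t c;

    if (p >= len)
        return DECODE_ERROR;               /* nothing left to read */
    c = s[p];
    if (c < 0x80) {
        *i = p + 1;
        return c;
    } else if (c < 0xC0) {
        return DECODE_ERROR;               /* stray continuation byte */
    } else if (c < 0xE0) {
        n = 1; min = 0x80;    c &= 0x1F;
    } else if (c < 0xF0) {
        n = 2; min = 0x800;   c &= 0x0F;
    } else if (c < 0xF8) {
        n = 3; min = 0x10000; c &= 0x07;
    } else {
        return DECODE_ERROR;               /* invalid lead byte */
    }
    if (len - p - 1 < n)
        return DECODE_ERROR;               /* truncated sequence */
    for (size_t k = 1; k <= n; k++) {
        unsigned char b = s[p + k];
        if ((b & 0xC0) != 0x80)
            return DECODE_ERROR;           /* not a continuation byte */
        c = c << 6 | (uint32_t)(b & 0x3F);
    }
    if (c < min)
        return DECODE_ERROR;               /* overlong encoding */
    if (c >= 0xD800 && c <= 0xDFFF)
        return DECODE_ERROR;               /* UTF-16 surrogate */
    if (c > 0x10FFFF)
        return DECODE_ERROR;               /* beyond the Unicode range */
    *i = p + 1 + n;
    return c;
}
```

On valid input every added branch goes the same way each time, which is why they predict well, but they still have to be executed for each code point.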
While my initial decode implementation
https://github.com/MartinNowak/phobos/blob/1b0edb728c/std/utf.d#L577-L651 was
transmogrified into 200 lines in the meantime
https://github.com/dlang/phobos/blob/acafd848d8/std/utf.d#L1167-L1369, you can
still use it to benchmark validation.
I ran a lot of benchmarks when introducing that function, and the code path
for decoding just remains slow, even with the throwing code path moved out of
the normal control flow.