Why isn't skipOver(string, string) nothrow?

Tue Oct 22 19:52:51 UTC 2019

On Tuesday, October 22, 2019 9:33:05 AM MDT Per Nordlöw via
Digitalmars-d-learn wrote:
> Why isn't a call to
>
>      skipOver(string, string)
>
> nothrow?
>
> I see no reason why it shouldn't be.
>
> Further, this test should be qualifiable as nothrow:
>
> @safe pure /* TODO nothrow @nogc */ unittest
> {
>      import std.algorithm.searching : skipOver;
>      auto x = "beta version";
>      assert(x.skipOver("beta"));
>      assert(x == " version");
> }

Almost anything involving strings isn't going to be nothrow, because front
and popFront throw on invalid UTF. To really fix that, we'd need to get rid
of auto-decoding. You can use std.utf.byDchar to wrap the string in a range
of dchar which replaces invalid Unicode with the replacement character
instead, which means that no exception gets thrown, but it also means that
if you hadn't previously validated the Unicode, you could end up processing
invalid Unicode without realizing it. How much that matters depends on what
you're doing. Ideally, all strings would just be validated when they were
created, and then it wouldn't be an issue, but any code that decodes the
code points would still have to deal with invalid Unicode in some manner
(though if we decided that it was the responsibility of the caller to always
validate the Unicode first, then we could use assertions). For better or
worse, the chosen solution when ranges were first put together was to throw
on invalid Unicode, which basically makes it impossible for functions that
process strings to be nothrow unless they go to the extra effort of working
around auto-decoding. If we're ever able to remove auto-decoding, then that
will no longer be an issue for string processing functions in general, but
it will still be an issue for any code that calls functions like decode or
stride. They're either going to throw or replace invalid Unicode with the
replacement character. Which approach is better depends on the code.
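
To make the two behaviors concrete, here is a rough sketch, assuming a
reasonably recent Phobos in which std.utf.decode accepts the
useReplacementDchar flag. The helper makeInvalidUtf8 is hypothetical and only
exists to build a string containing one invalid byte.

```d
import std.algorithm.searching : canFind;
import std.exception : assertThrown;
import std.typecons : Yes;
import std.utf : byDchar, decode, UTFException;

// Hypothetical helper: "a", a stray continuation byte (0x80), then "b".
// A lone 0x80 is not valid UTF-8.
string makeInvalidUtf8()
{
    return ['a', cast(char) 0x80, 'b'].idup;
}

void main()
{
    string bad = makeInvalidUtf8();

    // Default decode: throws UTFException at the invalid byte.
    size_t i = 1;
    assertThrown!UTFException(decode(bad, i));

    // decode with Yes.useReplacementDchar: yields U+FFFD instead of throwing.
    size_t j = 1;
    assert(decode!(Yes.useReplacementDchar)(bad, j) == '\uFFFD');

    // byDchar: iteration does not throw; the bad byte shows up as U+FFFD.
    assert(bad.byDchar.canFind('\uFFFD'));
}
```

Note that in the last two cases the caller never learns the input was invalid
unless it looks for U+FFFD itself, which is the trade-off described above.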

In any case, as long as auto-decoding is a thing, you'll have to use
wrappers like byDchar or byCodeUnit if you want much of anything involving
strings to be nothrow.
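
As a sketch of what that could look like for the test in the question,
assuming skipOver and equal infer nothrow and @nogc for byCodeUnit ranges in
your Phobos version (not checked against every release), the hypothetical
function skipsBetaPrefix below wraps both the haystack and the needle:

```d
import std.algorithm.comparison : equal;
import std.algorithm.searching : skipOver;
import std.utf : byCodeUnit;

// Hypothetical rewrite of the test from the question. byCodeUnit turns each
// string into a range of char, so no decoding (and no throwing) takes place.
bool skipsBetaPrefix() @safe pure nothrow @nogc
{
    auto x = "beta version".byCodeUnit;
    // skipOver advances x past the prefix when it matches.
    return x.skipOver("beta".byCodeUnit)
        && x.equal(" version".byCodeUnit);
}

void main()
{
    assert(skipsBetaPrefix());
}
```

The cost is that x is no longer a string, so comparisons go through equal
rather than ==, and the code operates on code units rather than code points.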

- Jonathan M Davis