What is the point of nothrow?

Wed Jun 13 00:38:55 UTC 2018

On Tuesday, June 12, 2018 23:32:55 Neia Neutuladh via Digitalmars-d-learn 
wrote:
> On Monday, 11 June 2018 at 00:47:27 UTC, Jonathan M Davis wrote:
> > Why do you care about detecting code that can throw an Error?
> > Errors are supposed to kill the program, not get caught. As
> > such, why does it matter if it can throw an Error?
>
> Error is currently used for three different things:
> * This is a problem that could occur in such a wide range of
> circumstances, it would make it difficult to use nothrow.

This is not a valid reason to use Error. Error is specifically for cases
where failure is a bug in the program or where the program cannot recover
from the failure and must be terminated. If a program is simply trying to be
able to use nothrow, then it needs to use an error-handling mechanism other
than exceptions. Not only is this how Errors are designed to work, but the
fact that proper clean-up is not guaranteed when a non-Exception Throwable
is thrown means that attempting to continue after anything other than an
Exception is thrown is incredibly risky, potentially putting your program in
an invalid state and causing who knows what bugs. And nothrow functions are
a prime case where clean-up is definitely not done for non-Exceptions,
because avoiding the extra code necessary to do that clean-up is one of the
main reasons that nothrow exists in the first place.
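
As a minimal sketch of what "an error-handling mechanism other than
exceptions" can look like, the function below reports failure through its
return value, so it can be marked nothrow. The names ParseResult and
parseDigit are made up for this illustration and are not from any library.

```d
import std.stdio : writeln;

// Hypothetical result type: carries a success flag alongside the value.
struct ParseResult
{
    bool ok;
    int value;
}

// nothrow: the compiler rejects this function if an Exception could escape it.
// Failure is reported through the return value instead of being thrown.
ParseResult parseDigit(char c) nothrow @nogc @safe pure
{
    if (c < '0' || c > '9')
        return ParseResult(false, 0); // bad input, but not a bug in the program
    return ParseResult(true, c - '0');
}

void main()
{
    auto good = parseDigit('7');
    auto bad = parseDigit('x');
    assert(good.ok && good.value == 7);
    assert(!bad.ok);
    writeln("parsed: ", good.value);
}
```

The caller has to check the ok flag, which is the price of not using
exceptions, but nothing here relies on Error for ordinary failure.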

> * This is a problem severe enough that almost every program would
> have to abort in these circumstances, so it's reasonable to abort
> every program here, and damn the few that could handle this type
> of problem.
> * This is a problem that someone thinks you might not want to
> catch when you write `catch (Exception)`, even if it can't be
> thrown from many places and it wouldn't kill most programs.
>
> As an example of the former: I have a service that uses
> length-prefixed messages on raw sockets. Someone tries to connect
> to this service with curl. The length of the message is read as
> 0x4854_5450_2131_2E31 -- ASCII "HTTP/1.1" as an unsigned long.
>
> (Or we read a 32-bit length, but we're running on a system with
> 128MB of RAM and overcommit turned off.)
>
> The program might be in an invalid state if this allocation
> fails. It might not. This depends entirely on how it was written.
> The runtime is in a valid state. But the exception is
> OutOfMemoryError, which inherits from Error.

It's possible to write programs that check and handle running out of memory,
but most programs don't, and usually, if a program runs out of memory, it
can't do anything about it and can't function properly at that point. As
such, D's new was designed with the idea that failed memory allocations are
fatal to the program, and any program that wants to be able to handle the
case where it runs out of memory but somehow is able to continue to function
shouldn't be using the GC for such allocations.

But programs that can even attempt to recover from running out of memory are
going to be rare, and having running out of memory throw an Exception would
likely cause all kinds of fun problems in the typical case, since if
anything catches the Exception, that could easily trigger a chain reaction
of nasty stuff. The catch almost certainly wouldn't be properly attempting
to recover from running out of memory, and the program would almost
certainly assume that allocations always succeeded rather than exiting on
allocation failure. So, continuing at that point would effectively put the
program in an invalid state. Also, if simply allocating memory could throw
an Exception, then that would pretty much kill nothrow, since it would only
be viable in @nogc code.

So, while treating all failed memory allocations as fatal is certainly a
debatable choice, it does fit what most programs do quite well. But either
way, the result is that anyone programming in D who might want to recover
from memory allocation failures needs to take that design into account and
really should be avoiding the GC for such allocations.
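
For the length-prefixed message case, a rough sketch of avoiding the GC
could look like the following. It uses C's malloc via core.stdc.stdlib and
checks for null rather than relying on new. The helper names tryAllocate and
release are hypothetical, and a real program would probably also want an
upper bound on the message size.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical helper: returns null on failure instead of throwing
// OutOfMemoryError, so the caller can decide how to react.
ubyte[] tryAllocate(ulong length) nothrow @nogc
{
    // Reject empty requests and lengths that do not fit in size_t.
    if (length == 0 || length > size_t.max)
        return null;
    auto p = cast(ubyte*) malloc(cast(size_t) length);
    if (p is null)
        return null; // allocation failed; nothing is thrown
    return p[0 .. cast(size_t) length];
}

// Hypothetical helper: frees a buffer obtained from tryAllocate.
void release(ubyte[] buf) nothrow @nogc
{
    free(buf.ptr);
}

void main()
{
    // The bogus length from the quoted example ("HTTP/1.1" read as a ulong).
    auto huge = tryAllocate(0x4854_5450_2131_2E31);
    if (huge is null)
    {
        // Treat it as bad input: drop the connection, log, carry on.
    }
    else
    {
        release(huge);
    }

    auto small = tryAllocate(16);
    assert(small !is null && small.length == 16);
    release(small);
}
```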

> Similarly, RangeError. There's little conceptual difference
> between `try {} catch (RangeError) break` and `if (i >= length)
> break`. But forbidding dynamic array indexing in nothrow code
> would be rather extreme.

The idea is that it's a bug in your code if you ever index an array with an
index that's out-of-bounds. If there's any risk of indexing incorrectly,
then the program needs to check for it, or it's a bug in the program. Most
indices are not taken from program input, so treating them as input in the
general case wouldn't really make sense - plus, of course, treating them as
program input in the general case would mean using Exceptions, which would
then kill nothrow. In the end, it just makes more sense to treat invalid
indices as programming errors. So, in the cases where an index is actually
derived from program input, the program must check the index, or it's a bug,
and the result will be an Error being thrown.
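
To make that concrete, here is a small sketch of checking an index that
comes from input before using it, so that RangeError never enters the
picture. The function name tryGet is made up for this example.

```d
// Hypothetical helper: `index` is assumed to come from untrusted input.
// Returns false instead of letting an out-of-bounds index reach the array.
bool tryGet(const(int)[] data, size_t index, out int value) nothrow @nogc @safe pure
{
    if (index >= data.length)
        return false;    // bad input, handled without any Throwable
    value = data[index]; // the index is known to be in bounds here
    return true;
}

void main()
{
    int[] data = [10, 20, 30];
    int v;
    assert(tryGet(data, 1, v) && v == 20);
    assert(!tryGet(data, 3, v));
}
```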

> On the other hand, a Unicode decoding error is a
> UnicodeException, not a UnicodeError. I guess whoever wrote that
> thought invalid Unicode data was sufficiently more common than
> invalid values in length-prefixed data formats to produce a
> difference in kind. This isn't obviously wrong, but it does look
> like something that could use justification.

The difference is that incorrectly indexing an array is considered a bug in
your program, whereas bad Unicode is almost always bad program input. Bad
input to a program is not a bug in the program. Assuming that the input is
valid and treating it that way when it might be invalid would be a bug in
the program, but code that validates program input is not buggy because it
determines that the input is bad. As such, throwing an Error on bad Unicode
doesn't make much sense. The only way that it would make sense to treat
invalid Unicode as a bug in the program would be if it were reasonable to
assume that all Unicode was validated before ever being passed to
std.utf.decode or std.utf.stride.
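
As an illustration of treating bad Unicode as bad input, the sketch below
wraps std.utf.validate (which, as far as I know, throws UTFException on
malformed input in current Phobos) so that the caller gets a bool back. The
wrapper name isWellFormed is hypothetical.

```d
import std.utf : validate;

// Hypothetical wrapper: turns the exception from std.utf.validate into a
// bool. Catching Exception (not Throwable) is what lets this be nothrow,
// and Errors are deliberately left alone.
bool isWellFormed(const(char)[] input) nothrow
{
    try
    {
        validate(input);
        return true;
    }
    catch (Exception e)
    {
        return false; // malformed input, not a bug in the program
    }
}

void main()
{
    assert(isWellFormed("hello"));

    // 0xC3 starts a two-byte sequence, but 0x28 is not a continuation byte.
    ubyte[] raw = [0xC3, 0x28];
    assert(!isWellFormed(cast(const(char)[]) raw));
}
```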

- Jonathan M Davis