The extent of trust in errors and error handling

Fri Feb 3 23:24:12 PST 2017

On 02/01/2017 06:29 PM, Chris Wright wrote:
 > On Wed, 01 Feb 2017 11:25:07 -0800, Ali Çehreli wrote:
 >> 1) There is the well-known issue of whether Error should ever be caught.
 >> If Error represents conditions where the application is not in a defined
 >> state, hence it should stop operating as soon as possible, should that
 >> also carry over to other applications, to the OS, and perhaps even to
 >> other systems in the whole cluster?
 >
 > My programs tend to apply operations to a queue of data. It might be a
 > queue over time, like incoming requests, or it might be a queue based on
 > something else, like URLs that I extract from HTML documents.
 >
 > Anything that does not impact my ability to manipulate the queue can be
 > safely caught and recovered from.
 >
 > Stack overflow? Be my guest.
 >
 > Null pointer? It's a bug, but it's probably specific to a small subset of
 > queue items -- log it, put it in the dead letter queue, move on.
 >
 > RangeError? Again, a bug, but I can successfully process everything else.

In practice, both null pointer and range error can probably be dealt 
with and the program can move forward.

However, in theory you cannot be sure why that pointer is null or why 
that index is out of range. It's possible that something horrible 
happened many clock cycles ago and you're seeing the side effects of 
that thing now.

What operations can you safely assume that you can still perform? Can 
you log? Are you sure? Even if you caught RangeError, are you sure that 
arr.ptr is still sane? etc.
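
To make the scenario concrete, here is a minimal sketch of the pattern
being discussed, assuming array bounds checking is enabled in the build.
The names thirdElement, drain, and Outcome are made up for illustration;
they are not from anyone's actual code.

```d
import core.exception : RangeError;

// Hypothetical per-item operation with a bug: it assumes that every
// item has at least three elements.
int thirdElement(const int[] item)
{
    return item[2]; // the bounds check fails for shorter items
}

// Hypothetical result type: processed values plus a dead letter queue.
struct Outcome
{
    int[] results;
    int[][] deadLetters;
}

Outcome drain(int[][] queue)
{
    Outcome o;
    foreach (item; queue)
    {
        try
        {
            o.results ~= thirdElement(item);
        }
        catch (RangeError e)
        {
            // This compiles and usually runs, but nothing guarantees
            // that the program state is still sane at this point.
            o.deadLetters ~= item;
        }
    }
    return o;
}

void main()
{
    import std.stdio : writeln;

    auto o = drain([[1, 2, 3], [4], [5, 6, 7]]);
    writeln("processed: ", o.results, ", dead letters: ", o.deadLetters);
}
```

The catch block is where the trust question lives: appending to
deadLetters allocates, so even this tiny recovery step depends on the
runtime still being in good shape.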

In theory, at least the way I understand it, a program lives on a very
narrow path. Once it steps outside that well-known path, all bets are
off. Can a caught Error bring it back onto the path, or are we on an
alternate path now?

 >> 2) What if an intermediate layer of code did in fact handle an Error
 >> (perhaps raised by a function pre-condition check)? Should the callers
 >> of that layer have a say on that? Should a higher level code be able to
 >> say that Error should not be handled at all?
 >>
 >> For example, an application code may want to say that no library that it
 >> uses should handle Errors that are thrown by a security library.
 >
 > There's a bit of a wrinkle there. "Handling" an error might include
 > catching it, adding some extra data, and then rethrowing.

Interestingly, attempting to add extra data can very well have the
opposite effect: stack trace information that would otherwise have been
available can be corrupted while that extra data is being added.
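
The catch, annotate, and rethrow pattern could look something like the
following sketch. checkIndex and intermediateLayer are hypothetical
names, and the precondition check is a stand-in for whatever a real
library would do.

```d
import std.conv : text;

// Hypothetical precondition check, standing in for a library contract.
void checkIndex(size_t i, size_t length)
{
    if (i >= length)
        throw new Error(text("index ", i, " out of range for length ", length));
}

// Hypothetical intermediate layer that "handles" the Error by adding
// context and rethrowing.
void intermediateLayer(size_t i)
{
    try
    {
        checkIndex(i, 3);
    }
    catch (Error e)
    {
        // Building the new message allocates and runs more code in a
        // program whose state is already suspect.
        // The original Error is chained and reachable through .next.
        throw new Error(text("while handling request ", i), e);
    }
}

void main()
{
    import std.stdio : writeln;

    try
    {
        intermediateLayer(7);
    }
    catch (Error e)
    {
        writeln(e.msg, " <- ", e.next.msg);
    }
}
```

Chaining the original through the nextInChain constructor argument
keeps it available to the caller, but only if the annotation step
itself ran without disturbing anything.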

The interesting part is trust. Once there is an Error, what can you trust?

 >> I think there is no way of
 >> requiring that e.g. a square root function not have side effects at all:
 >> The compiler can allow a piece of code but then the library that was
 >> actually linked with the application can do anything else that it wants.
 >
 > You can write a compiler with its own object format and linker, which lets
 > you verify these promises at link time.

Good idea. :) As Joakim reminded us, the designers of Midori did that and more.

 > As an aside on this topic, I might recommend looking at Vigil, the
 > eternally morally vigilant programming language:
 > https://github.com/munificent/vigil
 >
 > It has a rather effective way of dealing with errors that aren't
 > explicitly handled.
 >

Thank you, I will look at it next.

Ali