The extent of trust in errors and error handling

Sun Feb 5 23:48:07 PST 2017

On 2/1/2017 11:25 AM, Ali Çehreli wrote:
> 1) There is the well-known issue of whether Error should ever be caught. If
> Error represents conditions where the application is not in a defined state,
> hence it should stop operating as soon as possible, should that also carry over
> to other applications, to the OS, and perhaps even to other systems in the whole
> cluster?

If it is possible for an application to leave other applications or the OS in a 
corrupted state, then yes, it should stop the OS as soon as possible. MS-DOS fell 
into this category: it was normal for a crashing program to scramble MS-DOS 
along with it. Attempting to continue running MS-DOS risked scrambling your hard 
disk as well (this happened to me many times). I eventually learned to reboot every 
time an app failed unexpectedly. As soon as I could, I moved all development to 
protected mode operating systems, and would port to DOS only as the last step.

> For example, if a function detected an inconsistency in a DB that is available
> to all applications (as is the case in the Unix model of user-based access
> protection), should all processes that use that DB stop operating as well?

A DB inconsistency is not a bug in the application; it is a problem with the 
input to the application. Therefore, it is not an Error, it is an Exception.

Simply put, an Error is a bug in the application. An Exception is a bug in the 
input to the application. The former is not recoverable, the latter is.
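
To make the distinction concrete, here is a minimal D sketch. The Account 
struct, InconsistentDataException, verify() and entryAt() are hypothetical names 
made up for illustration; only Throwable, Exception, Error and assert come from 
the language and runtime. Data read from the shared DB is treated as input and 
rejected with an Exception, while a violated internal assumption is checked with 
assert, which throws an Error when asserts are enabled.

```d
import std.exception : assertThrown;
import std.format : format;

// Hypothetical row read from a database shared with other applications.
struct Account
{
    int id;
    long balanceCents;
    long sumOfEntriesCents;
}

// Hypothetical Exception type for data that fails validation.
class InconsistentDataException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

// The row came from outside the program, so it is input.
// A bad row is reported with an Exception, which callers may recover from.
void verify(const Account a)
{
    if (a.balanceCents != a.sumOfEntriesCents)
        throw new InconsistentDataException(
            format("account %s: balance %s != sum of entries %s",
                   a.id, a.balanceCents, a.sumOfEntriesCents));
}

// The index is computed by our own code, so a bad index can only be a bug.
// A failed assert throws core.exception.AssertError, which is an Error.
long entryAt(const long[] entries, size_t index)
{
    assert(index < entries.length, "caller computed an out-of-range index");
    return entries[index];
}

// Error and Exception are separate branches under Throwable, so
// catch (Exception) never catches an Error.
static assert(is(Exception : Throwable));
static assert(is(Error : Throwable));
static assert(!is(Error : Exception));

unittest
{
    verify(Account(1, 500, 500));   // consistent row: no throw
    assertThrown!InconsistentDataException(verify(Account(2, 500, 450)));
    assert(entryAt([10L, 20L, 30L], 1) == 20);
}
```

Note that asserts are compiled out with -release, so this sketch assumes a 
build where they are still enabled.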

> 2) What if an intermediate layer of code did in fact handle an Error (perhaps
> raised by a function pre-condition check)? Should the callers of that layer have
> a say on that? Should a higher level code be able to say that Error should not
> be handled at all?

If the layer has access to the memory space of the caller, an Error in the layer 
is an Error in the caller as well.

> For example, an application code may want to say that no library that it uses
> should handle Errors that are thrown by a security library.

Depends on what you mean by "handling" an Error. If you mean continue running 
the application, you're running a corrupted program. If you mean logging the 
Error and then terminating the application, that would be reasonable.
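
One possible shape for "log and terminate" is a single guard at the top of the 
program. This is only a sketch: ExitCode and runGuarded() are hypothetical names, 
and the log sink is passed in as a delegate so the example can be exercised in a 
unittest. The guard does nothing after an Error except record it and hand back 
an exit code that the caller is expected to return from main() immediately.

```d
// Hypothetical exit codes for the process.
enum ExitCode { ok = 0, badInput = 1, bug = 2 }

// Hypothetical top-level guard: runs `app` and maps the outcome to an exit code.
ExitCode runGuarded(void delegate() app, void delegate(string) log)
{
    try
    {
        app();
        return ExitCode.ok;
    }
    catch (Exception e)
    {
        // Problem with the input: report it; the program state is still sound.
        log("bad input: " ~ e.msg);
        return ExitCode.badInput;
    }
    catch (Error e)
    {
        // Bug in the program: do the minimum (log) and stop.
        // No attempt is made to repair state or retry.
        log("bug, terminating: " ~ e.msg);
        return ExitCode.bug;
    }
}

unittest
{
    string[] lines;
    void log(string s) { lines ~= s; }

    assert(runGuarded(() { }, &log) == ExitCode.ok);
    assert(runGuarded(() { throw new Exception("missing file"); }, &log)
           == ExitCode.badInput);
    // An Error is thrown explicitly here only to simulate a detected bug.
    assert(runGuarded(() { throw new Error("simulated bug"); }, &log)
           == ExitCode.bug);
    assert(lines == ["bad input: missing file",
                     "bug, terminating: simulated bug"]);
}
```

In a real program main() would just return the result of runGuarded(), with the 
log delegate writing to stderr. Even the logging runs in a program that is 
already known to be wrong, so it should be kept as small as possible.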

----

This discussion has come up repeatedly on this forum. Many people strongly 
disagree with me, and believe that they can recover from Errors and continue 
executing the program.

That's fine if the program's output is nothing one cares about, such as a game 
or a music player. If the program's failure could result in the loss of money, 
property, health or lives, it is unacceptable.

Much of the remaining confusion comes from not carefully distinguishing Errors from Exceptions.

Corollary: bad input that causes a program to crash is an Error because it is a 
programming bug to fail to vet the input for correctness. For example, if I feed 
a D source file to a C compiler and the C compiler crashes, the C compiler has a 
bug in it, which is an Error. If the C compiler instead writes a message 
"Error: D source code found instead of C source code, please upgrade to a D 
compiler" then that is an Exception.
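
A rough D sketch of the second case, where the compiler vets its input instead 
of crashing on it. Everything here is hypothetical: WrongLanguageException, 
vetSourceFile() and diagnose() are invented names, and recognizing D source by 
its ".d" file extension is a deliberate simplification, not how any real 
compiler decides.

```d
import std.exception : assertThrown;
import std.path : extension;

// Hypothetical Exception type for input in the wrong language.
class WrongLanguageException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

// Vet the input before compiling it. Assumption: a ".d" extension means
// the file is D source. Bad input is an Exception, not a crash.
void vetSourceFile(string path)
{
    if (path.extension == ".d")
        throw new WrongLanguageException(
            "D source code found instead of C source code, " ~
            "please upgrade to a D compiler");
}

// Hypothetical driver step: returns the diagnostic to print,
// or null if the file was accepted.
string diagnose(string path)
{
    try
    {
        vetSourceFile(path);
        return null;
    }
    catch (WrongLanguageException e)
    {
        return "Error: " ~ e.msg;
    }
}

unittest
{
    assert(diagnose("hello.c") is null);
    assert(diagnose("hello.d") ==
        "Error: D source code found instead of C source code, " ~
        "please upgrade to a D compiler");
    assertThrown!WrongLanguageException(vetSourceFile("hello.d"));
}
```

The word "Error" in the printed message is just text for the user. What matters 
is that the condition travels through the program as an Exception and the 
compiler itself stays in a defined state.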