Program logic bugs vs input/environmental errors

Sun Sep 28 15:51:40 PDT 2014

On 28/09/14 22:13, Walter Bright via Digitalmars-d wrote:
> On 9/28/2014 12:33 PM, Sean Kelly wrote:
>>> Then use assert(). That's just what it's for.
>> What if I don't want to be forced to abort the program in the event of such an
>> error?
>
> Then we are back to the discussion about can a program continue after a logic
> error is uncovered, or not.
>
> In any program, the programmer must decide if an error is a bug or not, before
> shipping it. Trying to avoid making this decision leads to confusion and using
> the wrong techniques to deal with it.
>
> A program bug is, by definition, unknown and unanticipated. The idea that one
> can "recover" from it is fundamentally wrong. Of course, in D one can try and
> recover from them anyway, but you're on your own trying that, just as you're on
> your own when casting integers to pointers.

Allowing for your "you can try ..." remarks, I still feel this doesn't really 
cover the practical realities of how some applications need to behave.

Put it this way: suppose we're writing the software for a telephone exchange, 
which is handling thousands of simultaneous calls.  If an Error is thrown inside 
the part of the code handling one single call, is it correct to bring down 
everyone else's call too?

I appreciate that you might tell me "You need to find a different means of error 
handling that can distinguish errors that are recoverable", but the bottom line 
is, in such a scenario it's not possible to completely rule out an Error being 
thrown (an obvious cause would be an assert that gets triggered because the 
programmer forgot to put a corresponding enforce() statement at a higher level 
in the code).
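
To make that failure mode concrete, here is a minimal sketch of the assert/enforce split I have in mind. The function names (channelFor, routeCall) and the modulo-16 mapping are made up for illustration. The point is only that the inner assert() is safe solely because the outer enforce() is present; drop the enforce() and bad input becomes an AssertError deep in the call handling. This assumes a build without -release, since asserts are compiled out there.

```d
import std.exception : assertThrown, enforce;

// Hypothetical low-level routine: it assumes its caller has already
// validated the input, so a bad value here is a program logic bug.
int channelFor(int extension)
{
    assert(extension >= 0, "extension must be non-negative"); // throws AssertError (an Error)
    return extension % 16;
}

// Hypothetical entry point: it receives outside input, so a bad value
// here is an input error and is reported with a recoverable Exception.
int routeCall(int extension)
{
    enforce(extension >= 0, "invalid extension"); // throws Exception
    return channelFor(extension);
}

void main()
{
    // Bad input is stopped at the boundary as an Exception...
    assertThrown!Exception(routeCall(-1));
    // ...and valid input passes through both checks.
    assert(routeCall(35) == 3);
}
```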

However, it's clearly very desirable in this use-case for the application to 
keep going if at all possible and for any problem, even an Error, to be 
contained in its local context if we can do so.  (By "local context", in 
practice this probably means a thread or fiber or some other similar programming 
construct.)

Sean's touched on this in the current thread with his reference to Erlang, and I 
remember that he and Dicebot brought the issue up in an earlier discussion on 
the Error vs. Exception question, but I don't recall that discussion having any 
firm conclusion, and I think it's important to address; we can't simply take "An 
Error is unrecoverable" as a point of principle for every application.

(Related note: If I recall right, an Error or uncaught Exception thrown within a 
thread or fiber will not actually bring the application down, only cause that 
thread/fiber to hang, without printing any indication of anything going wrong. 
So on a purely practical basis, it can be essential for the top-level code of a 
thread or fiber to have a catch {} block for both Errors and Exceptions, just in 
order to be able to report what has happened effectively.)
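
As a rough sketch of what I mean by a top-level catch, assuming core.thread threads and a made-up handleCall() in which a failing assert stands in for a logic bug (again, not built with -release):

```d
import core.thread : Thread;
import std.stdio : stderr;

// Hypothetical per-call handler; the failing assert stands in for a logic bug.
void handleCall(int callId)
{
    assert(callId != 3, "invariant violated");
}

// Built in a helper function so that each delegate captures its own callId.
Thread makeWorker(int callId)
{
    return new Thread({
        try
        {
            handleCall(callId);
        }
        catch (Throwable t) // covers both Error and Exception
        {
            // Report only. Whether it is safe to carry on is the open question.
            stderr.writefln("call %s failed: %s", callId, t.msg);
        }
    });
}

void main()
{
    Thread[] workers;
    foreach (id; 0 .. 5)
        workers ~= makeWorker(id);
    foreach (w; workers)
        w.start();
    foreach (w; workers)
        w.join();
}
```

This only makes the failure of call 3 visible while the other four calls complete; it says nothing about whether the state shared with those other calls can still be trusted, which is exactly the point under dispute.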