The Right Approach to Exceptions

Sun Feb 19 17:03:29 PST 2012

On Sat, Feb 18, 2012 at 11:09:23PM -0500, bearophile wrote:
> Sean Cavanaugh:
> 
> > In the Von Neumann model this has been made difficult by the stack
> > itself.  Thinking of exceptions as they are currently implemented in
> > Java, C++, D, etc is automatically artificially constraining how
> > they need to work.
> 
> It's interesting to take a look at how "exceptions" are designed in
> Lisp:
> http://www.gigamonkeys.com/book/beyond-exception-handling-conditions-and-restarts.html
[...]

I'm surprised nobody responded to this. I read through the article a
bit, and it does present some interesting concepts that we may be able
to make use of in D. Here's a brief (possibly incomplete) summary:

One problem with the try-throw-catch paradigm is that whenever an
exception is raised, the stack unwinds up some number of levels in the
call stack. By the time it gets to the catch{} block, the context in
which the problem happened is already long-gone, and there is no other
recourse but to abort the operation, or try it again from scratch. There
is no way to recover from the problem by, say, trying to fix it *in the
context in which it happened* and then continuing with the operation.

Say P calls Q, Q calls R, and R calls S. S finds a problem that prevents
it from doing what R expects it to do, so it throws an exception. R
doesn't know what to do, so it propagates the exception to Q. Q doesn't
know what to do either, so it propagates the exception to P. By the time
P gets to know about the problem, the execution context of S is long
gone; the operation that Q was trying to perform has already been
aborted. There's no way to recover except to repeat a potentially very
expensive operation.
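
To make that concrete, here's a minimal sketch in plain D of the
P/Q/R/S chain. The function names are just the placeholders from the
paragraph above, and `trace` is a hypothetical log I've added so the
order of events can be inspected:

```d
// Illustration only: P, Q, R and S are the placeholder functions from the
// text; `trace` is a hypothetical log used to make the unwinding visible.
string[] trace;

void S()
{
    scope(exit) trace ~= "S's frame is gone";
    throw new Exception("S cannot do what R expects");
}

void R() { scope(exit) trace ~= "R's frame is gone"; S(); }
void Q() { scope(exit) trace ~= "Q's frame is gone"; R(); }

// Returns the order in which things happened.
string[] P()
{
    trace = null;
    try
    {
        Q();
    }
    catch (Exception e)
    {
        // Every scope(exit) below us has already run by this point.
        trace ~= "P finally hears: " ~ e.msg;
    }
    return trace;
}
```

Calling P() from main() gives the three "frame is gone" entries for S,
R and Q, in that order, before P's own entry. Nothing of S's context is
left for P to repair.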

The way Lisp handles this is by something called "conditions". I won't
get into the definitions and stuff (just read the article), but the idea
is this:

- When S encounters a problem, it signals a "condition".

   - Along with the condition, it may register 0 or more "restarts",
     basically predefined methods of recovering from the condition.

- The runtime then tries to recover from the condition by:

   - Checking to see if there's a handler registered for this condition.
     If there is, invoke the most recently registered one *in the
     context of the function that triggered the condition*.

   - If there's no handler, unwind the stack and propagate the condition
     to the caller.

- There are two kinds of handlers:

   - The equivalent of a "catch": matches some subset of conditions that
     propagated to that point in the code. Some stack unwinding may
     already have taken place, so these are equivalent to catch blocks
     in D.

   - Pre-bound handlers: these are registered with the runtime condition
     handler before the condition is triggered (possibly very high up
     the call stack). They are invoked *in the context of the code that
     triggered the condition*. Their primary use is to decide which of
     the restarts associated with the condition should be used to
     recover from it.
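
The steps above can be approximated as a library in today's D. This is
only a rough sketch under my own assumptions: Condition, Handler,
handlerStack, signal and recoverInPlace are all made-up names, restarts
are stored as named delegates, and a handler picks a restart by
returning its name. It is not how Lisp implements it, and it is not an
existing D or Phobos API:

```d
// Hypothetical library-level sketch: Condition, Handler, handlerStack,
// signal and recoverInPlace are illustrative names, not part of D or Phobos.
class Condition : Exception
{
    void delegate()[string] restarts; // restart name -> recovery action

    this(string msg) { super(msg); }
}

// A pre-bound handler returns the name of the restart it wants,
// or null to decline.
alias Handler = string delegate(Condition);

Handler[] handlerStack; // the most recently registered handler is last

void signal(Condition c)
{
    foreach_reverse (h; handlerStack)
    {
        string choice = h(c); // runs before any stack unwinding
        if (choice is null)
            continue;
        if (auto restart = choice in c.restarts)
        {
            (*restart)(); // still inside the signalling call
            return;
        }
    }
    throw c; // nobody recovered: unwind and propagate as usual
}

int recoverInPlace()
{
    int value = -1; // local state of the code that hits the problem

    // High-level code registers its policy ahead of time.
    handlerStack ~= delegate string(Condition c) { return "useDefault"; };
    scope(exit) handlerStack = handlerStack[0 .. $ - 1];

    auto c = new Condition("value is missing");
    c.restarts["useDefault"] = delegate void() { value = 0; };
    signal(c); // the restart patches `value` without leaving this frame
    return value;
}
```

recoverInPlace() returns 0: the handler chose "useDefault", and the
restart fixed the local variable while its stack frame was still alive.
With an empty handlerStack, signal() simply throws the condition like an
ordinary exception.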

The pre-bound handlers are very interesting. They allow in-place
recovery by having high-level callers decide what to do, *without
unwinding the stack*. Here's an example:

LoadConfig() is a function that loads an application's configuration
files, parses them, and sets up some runtime objects based on
configuration file settings. LoadConfig calls a bunch of functions to
accomplish what it does, among which is ParseConfig(). ParseConfig() in
turn calls ParseConfigItem() for each configuration item in the config
file, to set up the runtime objects associated with that item.
ParseConfigItem() calls DecodeUTF() to convert the configuration file's
text representation from, say, UTF-8 to dchar. So the call stack looks
like this:

LoadConfig
	ParseConfig
		ParseConfigItem
			DecodeUTF

Now suppose the config file has some UTF encoding errors. This causes
DecodeUTF to throw a DecodingError. ParseConfigItem can't go on, since
that configuration item is mangled. So it propagates DecodingError to
ParseConfig.

Now, ParseConfig could simply abort, but using the idea of pre-bound
handlers, it can actually offer two ways of recovering: (1)
SkipConfigItem, to simply skip the mangled config item and process the
rest of the config file as usual, or (2) ReparseConfigItem, to allow
custom code to manually fix a bad config item and reprocess it.

The problem is, ParseConfig doesn't know which action to take. It's too
low-level to make that sort of decision. You need higher-level code,
that knows what the application needs to do, to decide that. But
ParseConfig can't just propagate the exception to said high-level code,
because if it does, parsing of the entire config file is aborted and
will have to be restarted from scratch.

The solution is to have the higher-level code register a delegate with
the exception system. Something like this:

	// NOTE: not real D code
	void main() {
		registerHandler(auto delegate(ParseError e) {
			if (canRepairItem(e.item)) {
				return e.ReparseConfigItem(
					repairConfigItem(e.item));
			} else {
				return e.SkipConfigItem();
			}
		});

		ParseConfig(configfile);
	}

Now when ParseConfig encounters a problem, it signals a ParseError
object with two options for recovery: ReparseConfigItem and
SkipConfigItem. It doesn't try to fix the problem on its own, but it
lets the delegate from main() make that decision. The runtime exception
system then sees if there's a matching handler, and calls the handler
with the ParseError to determine which course of action to take. If no
handler is found, or the handler decides to abort, then ParseError is
propagated to the caller with stack unwinding.

So ParseConfig might look something like this:

// NOTE: not real D code
auto ParseConfig(...) {
	foreach (item; config_items) {
		try {
			// Note: not real proposed syntax, this is just
			// to show the semantics of the mechanism:
			restart:
			auto objs = ParseConfigItem(item);
			SetupConfigObjects(objs);
		} catch(ParseConfigItemError) {
			// Note: not real proposed syntax, this is just
			// to show the semantics of the mechanism:
			ParseError e;
			e.item = item;
			e.ReparseConfigItem = void delegate(ConfigItem
				fixedItem)
			{
				item = fixedItem;
				goto restart;
			};
			e.SkipConfigItem = void delegate() {
				continue;
			};

			// This will unwind stack if no handler is
			// found, or handler decides to propagate
			// exception.
			handleError(e);
		}
	}
}

OK, so it looks real ugly right now. But if this mechanism is built into
the language, we could have much better syntax, something like this:

auto ParseConfig(...) {
	foreach (item; config_items) {
		try {
			auto objs = ParseConfigItem(item);
			SetupConfigObjects(objs);
		} recoverBy ReparseConfigItem(fixedItem) {
			item = fixedItem;
			restart;	// restarts try{} block
		} recoverBy SkipConfigItem() {
			setDefaultConfigObjs();
			continue;	// continues foreach loop
		}
	}
}

This is just a rough sketch syntax, just to show the idea. It can of
course be improved upon.
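
For comparison, here's a sketch of the same flow that should compile
with a current D compiler, done purely as a library. Everything in it is
my own simplification: config items are strings that must parse as
integers (standing in for the UTF decoding), the names follow D's
camelCase convention, the "px" repair rule is invented for the demo, and
since a delegate can't `goto` into its caller, a restart is expressed as
a Recovery value that parseConfig acts on:

```d
import std.algorithm : endsWith;
import std.conv : ConvException, to;

enum Action { abort, skip, reparse }

struct Recovery
{
    Action action;
    string fixedItem;
}

class ParseError : Exception
{
    string item;

    this(string item)
    {
        super("cannot parse config item: " ~ item);
        this.item = item;
    }

    // The two restarts on offer; a handler picks one by calling it.
    Recovery skipConfigItem() { return Recovery(Action.skip); }
    Recovery reparseConfigItem(string fixedItem)
    {
        return Recovery(Action.reparse, fixedItem);
    }
}

alias Handler = Recovery delegate(ParseError);
Handler[] handlers;

int parseConfigItem(string item)
{
    try
    {
        return item.to!int;
    }
    catch (ConvException)
    {
        throw new ParseError(item);
    }
}

int[] parseConfig(string[] items)
{
    int[] values;
    foreach (original; items)
    {
        string item = original;
        while (true)
        {
            try
            {
                values ~= parseConfigItem(item);
                break; // item done, on to the next one
            }
            catch (ParseError e)
            {
                auto r = handlers.length
                    ? handlers[$ - 1](e) : Recovery(Action.abort);
                final switch (r.action)
                {
                case Action.reparse:
                    item = r.fixedItem;
                    continue; // restart the try block
                case Action.skip:
                    break;
                case Action.abort:
                    throw e; // unwind to the caller
                }
                break; // skipped: leave the retry loop
            }
        }
    }
    return values;
}

int[] loadConfig(string[] items)
{
    handlers ~= delegate Recovery(ParseError e) {
        if (e.item.endsWith("px")) // hypothetical repair rule
            return e.reparseConfigItem(e.item[0 .. $ - 2]);
        return e.skipConfigItem();
    };
    scope(exit) handlers = handlers[0 .. $ - 1];
    return parseConfig(items);
}
```

loadConfig(["10", "12px", "oops", "7"]) returns [10, 12, 7]: "12px" is
repaired and reparsed, "oops" is skipped, and the items already parsed
are never redone. As in the pseudocode above, parseConfigItem's frame
has already unwound by the time the handler runs, but parseConfig's loop
state survives, which is the part that matters here.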

T

-- 
Nobody is perfect.  I am Nobody. -- pepoluan, GKC forum