The Right Approach to Exceptions

Mon Feb 20 01:32:57 PST 2012

On 2012-02-20 02:03, H. S. Teoh wrote:
> On Sat, Feb 18, 2012 at 11:09:23PM -0500, bearophile wrote:
>> Sean Cavanaugh:
>>
>>> In the Von Neumann model this has been made difficult by the stack
>>> itself.  Thinking of exceptions as they are currently implemented in
>>> Java, C++, D, etc is automatically artificially constraining how
>>> they need to work.
>>
>> It's interesting to take a look at how "exceptions" are designed in
>> Lisp:
>> http://www.gigamonkeys.com/book/beyond-exception-handling-conditions-and-restarts.html
> [...]
>
> I'm surprised nobody responded to this. I read through the article a
> bit, and it does present some interesting concepts that we may be able
> to make use of in D. Here's a brief (possibly incomplete) summary:
>
> One problem with the try-throw-catch paradigm is that whenever an
> exception is raised, the stack unwinds up some number of levels in the
> call stack. By the time it gets to the catch{} block, the context in
> which the problem happened is already long-gone, and there is no other
> recourse but to abort the operation, or try it again from scratch. There
> is no way to recover from the problem by, say, trying to fix it *in the
> context in which it happened* and then continuing with the operation.
>
> Say P calls Q, Q calls R, and R calls S. S finds a problem that prevents
> it from doing what R expects it to do, so it throws an exception. R
> doesn't know what to do, so it propagates the exception to Q. Q doesn't
> know what to do either, so it propagates the exception to P. By the time
> P gets to know about the problem, the execution context of S is long
> gone; the operation that Q was trying to perform has already been
> aborted. There's no way to recover except to repeat a potentially very
> expensive operation.
>
> The way Lisp handles this is by something called "conditions". I won't
> get into the definitions and stuff (just read the article), but the idea
> is this:
>
> - When a function encounters a problem, it signals a "condition".
>
>     - Along with the condition, it may register 0 or more "restarts",
>       basically predefined methods of recovering from the condition.
>
> - The runtime then tries to recover from the condition by:
>
>     - Checking to see if there's a handler registered for this condition.
>       If there is, invoke the most recently registered one *in the
>       context of the function that triggered the condition*.
>
>     - If there's no handler, unwind the stack and propagate the condition
>       to the caller.
>
> - There are two kinds of handlers:
>
>     - The equivalent of a "catch": matches some subset of conditions that
>       propagated to that point in the code. Some stack unwinding may
>       already have taken place, so these are equivalent to catch blocks
>       in D.
>
>     - Pre-bound handlers: these are registered with the runtime condition
>       handler before the condition is triggered (possibly very high up
>       the call stack). They are invoked *in the context of the code that
>       triggered the condition*. Their primary use is to decide which of
>       the restarts associated with the condition should be used to
>       recover from it.
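
The dispatch described in the list above can be approximated in present-day D with a stack of delegates. This is only a minimal sketch under assumptions: Condition, Handler, handlers, signal and readValues are made-up names, not an existing runtime or Phobos API. Restarts are modelled as named closures, and the handler picks one before any unwinding takes place.

```d
import std.stdio;

// Hypothetical types: nothing here exists in druntime or Phobos.
class Condition
{
    string msg;
    void delegate()[string] restarts; // restart name -> recovery action
    this(string msg) { this.msg = msg; }
}

// A handler may invoke one restart; it returns true if it recovered.
alias Handler = bool delegate(Condition);

Handler[] handlers; // module-level variables are thread-local in D

// Tries the most recently registered handler first, while the signalling
// function's frame is still live. Only if nobody recovers does it fall
// back to an ordinary throw, which unwinds the stack.
void signal(Condition c)
{
    foreach_reverse (h; handlers)
        if (h(c))
            return;
    throw new Exception(c.msg);
}

int[] readValues(string[] fields)
{
    import std.conv : to, ConvException;

    int[] result;
    foreach (f; fields)
    {
        try
        {
            result ~= f.to!int;
        }
        catch (ConvException)
        {
            auto c = new Condition("bad field: " ~ f);
            c.restarts["useZero"] = delegate() { result ~= 0; };
            c.restarts["skip"] = delegate() { };
            signal(c); // returns here if a handler chose a restart
        }
    }
    return result;
}

void main()
{
    // High-level policy, registered before the low-level code runs.
    Handler pickZero = delegate bool(Condition c) {
        c.restarts["useZero"]();
        return true;
    };
    handlers ~= pickZero;
    scope (exit) handlers = handlers[0 .. $ - 1]; // unregister on exit

    writeln(readValues(["1", "x", "3"])); // [1, 0, 3]
}
```

The handler runs inside signal(), so readValues keeps its partial result and simply continues with the next field.
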
>
> The pre-bound handlers are very interesting. They allow in-place
> recovery by having high-level callers decide what to do, *without
> unwinding the stack*. Here's an example:
>
> LoadConfig() is a function that loads an application's configuration
> files, parses them, and sets up some runtime objects based on
> configuration file settings. LoadConfig calls a bunch of functions to
> accomplish what it does, among which is ParseConfig(). ParseConfig() in
> turn calls ParseConfigItem() for each configuration item in the config
> file, to set up the runtime objects associated with that item.
> ParseConfigItem() calls DecodeUTF() to convert the configuration file's
> text representation from, say, UTF-8 to dchar. So the call stack looks
> like this:
>
> LoadConfig
> 	ParseConfig
> 		ParseConfigItem
> 			DecodeUTF
>
> Now suppose the config file has some UTF encoding errors. This causes
> DecodeUTF to throw a DecodingError. ParseConfigItem can't go on, since
> that configuration item is mangled. So it propagates DecodingError to
> ParseConfig.
>
> Now, ParseConfig could simply abort, but using the idea of pre-bound
> handlers, it can actually offer two ways of recovering: (1)
> SkipConfigItem, to simply skip the mangled config item and process the
> rest of the config file as usual, or (2) ReparseConfigItem, to allow
> custom code to manually fix a bad config item and reprocess it.
>
> The problem is, ParseConfig doesn't know which action to take. It's too
> low-level to make that sort of decision. You need higher-level code,
> that knows what the application needs to do, to decide that. But
> ParseConfig can't just propagate the exception to said high-level code,
> because if it does, parsing of the entire config file is aborted and
> will have to be restarted from scratch.
>
> The solution is to have the higher-level code register a delegate with
> the exception system. Something like this:
>
> 	// NOTE: not real D code
> 	void main() {
> 		registerHandler(auto delegate(ParseError e) {
> 			if (can_repair_item(e.item)) {
> 				return e.ReparseConfigItem(
> 					repairConfigItem(e.item));
> 			} else {
> 				return e.SkipConfigItem();
> 			}
> 		});
>
> 		ParseConfig(configfile);
> 	}
>
> Now when ParseConfig encounters a problem, it signals a ParseError
> object with two options for recovery: ReparseConfigItem and
> SkipConfigItem. It doesn't try to fix the problem on its own, but it
> lets the delegate from main() make that decision. The runtime exception
> system then sees if there's a matching handler, and calls the handler
> with the ParseError to determine which course of action to take. If no
> handler is found, or the handler decides to abort, then ParseError is
> propagated to the caller with stack unwinding.
>
> So ParseConfig might look something like this:
>
> // NOTE: not real D code
> auto ParseConfig(...) {
> 	foreach (item; config_items) {
> 		try {
> 			// Note: not real proposed syntax, this is just
> 			// to show the semantics of the mechanism:
> 			restart:
> 			auto objs = ParseConfigItem(item);
> 			SetupConfigObjects(objs);
> 		} catch(ParseConfigItemError) {
> 			// Note: not real proposed syntax, this is just
> 			// to show the semantics of the mechanism:
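> 			// ParseError is the condition main()'s handler
> 			// receives; it carries the offending item.
> 			ParseError e;
> 			e.item = item;
> 			e.ReparseConfigItem = void delegate(ConfigItem
> 				fixedItem)
> 			{
> 				item = fixedItem;
> 				goto restart;
> 			};
> 			e.SkipConfigItem = void delegate() {
> 				continue;
> 			};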
>
> 			// This will unwind stack if no handler is
> 			// found, or handler decides to propagate
> 			// exception.
> 			handleError(e);
> 		}
> 	}
> }
>
> OK, so it looks real ugly right now. But if this mechanism is built into
> the language, we could have much better syntax, something like this:
>
> auto ParseConfig(...) {
> 	foreach (item; config_items) {
> 		try {
> 			auto objs = ParseConfigItem(item);
> 			SetupConfigObjects(objs);
> 		} recoverBy ReparseConfigItem(fixedItem) {
> 			item = fixedItem;
> 			restart;	// restarts try{} block
> 		} recoverBy SkipConfigItem() {
> 			setDefaultConfigObjs();
> 			continue;	// continues foreach loop
> 		}
> 	}
> }
>
> This is just a rough sketch syntax, just to show the idea. It can of
> course be improved upon.
>
>
> T

I was actually thinking of something similar, specifically the part
about registering exception handlers, i.e. using "registerHandler".
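
For what it's worth, something close to this seems expressible in today's D without language changes. The following is a rough sketch, not a proposal: Action, Recovery, Handler, registerHandler, parseConfigItem and parseConfig are hypothetical names modelled on the example above, and the "restarts" are reduced to a value the handler returns. Unlike real Lisp conditions, the catch has already unwound out of parseConfigItem, but parseConfig's loop state survives, which is all the skip/reparse choice needs.

```d
import std.stdio;
import std.string : indexOf;

// All names are hypothetical, chosen to mirror the thread's example.
enum Action { skip, reparse, abort }

struct Recovery
{
    Action action;
    string fixedItem; // only meaningful for Action.reparse
}

class ParseError : Exception
{
    string item;
    this(string item)
    {
        super("cannot parse config item: " ~ item);
        this.item = item;
    }
}

alias Handler = Recovery delegate(ParseError);
Handler[] handlers; // thread-local stack of pre-bound handlers

void registerHandler(Handler h) { handlers ~= h; }

// Stand-in for ParseConfigItem: accepts only "key=value".
string[2] parseConfigItem(string item)
{
    auto i = item.indexOf('=');
    if (i <= 0)
        throw new ParseError(item);
    return [item[0 .. i], item[i + 1 .. $]];
}

string[string] parseConfig(string[] items)
{
    string[string] config;
    foreach (item; items)
    {
        while (true) // plays the role of the "restart" label
        {
            try
            {
                auto kv = parseConfigItem(item);
                config[kv[0]] = kv[1];
                break;
            }
            catch (ParseError e)
            {
                // Ask the most recently registered handler; abort if none.
                auto r = handlers.length ? handlers[$ - 1](e)
                                         : Recovery(Action.abort);
                final switch (r.action)
                {
                    // Note: loops forever if the handler keeps returning
                    // an item that still fails to parse.
                    case Action.reparse: item = r.fixedItem; continue;
                    case Action.skip:    break;
                    case Action.abort:   throw e;
                }
                break; // skip: leave the retry loop, go to the next item
            }
        }
    }
    return config;
}

void main()
{
    registerHandler(delegate Recovery(ParseError e) {
        // Treat "key:value" as a repairable typo for "key=value".
        auto i = e.item.indexOf(':');
        if (i > 0)
            return Recovery(Action.reparse,
                            e.item[0 .. i] ~ "=" ~ e.item[i + 1 .. $]);
        return Recovery(Action.skip);
    });

    auto cfg = parseConfig(["name=demo", "port:8080", "garbage"]);
    writeln(cfg["port"]);                            // 8080
    writeln("garbage" in cfg ? "kept" : "skipped");  // skipped
}
```

The policy lives entirely in main(); parseConfig only offers the choices and never decides between them.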

-- 
/Jacob Carlborg