On exceptions in D

Sun Feb 9 17:16:09 PST 2014

On Sunday, February 09, 2014 21:57:13 Dmitry Olshansky wrote:
> Split out of "List of Phobos functions that allocate memory?".
> 
> To reiterate, here is some critique, compiled:
> 
> 1. Exceptions are class instances, hence (by default) are allocated on
> GC heap. This is wrong default, GC is no place for temporaries.

The more I think about it, the less I'm convinced that this is a big deal.
Sure, it would be nice if they were malloc-ed and reference-counted so that
they'd go away immediately after they were used and didn't risk triggering a
collection. But for exceptions to work right, they need inheritance, so they
have to be classes. And since exceptions are generally rare, the extra
overhead of the memory allocation usually isn't a big deal. If an exception
triggers a collection or sticks around for a little while after it's used
(because no collection gets run), most code won't care.
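
To make the inheritance point concrete, here is a minimal sketch of why the
hierarchy matters. NetworkException, TimeoutException, connect and classify
are all hypothetical names invented for this example; none of them come from
druntime or Phobos.

```d
import std.stdio : writeln;

// Hypothetical exception types, used only for illustration.
class NetworkException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

class TimeoutException : NetworkException
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

// Stand-in for an operation that can fail in several ways.
void connect(int kind)
{
    if (kind == 1) throw new TimeoutException("timed out");
    if (kind == 2) throw new NetworkException("connection reset");
    if (kind == 3) throw new Exception("something else");
}

// Handlers are tried in order, most derived type first.
string classify(int kind)
{
    try
    {
        connect(kind);
        return "ok";
    }
    catch (TimeoutException e) { return "timeout"; }
    catch (NetworkException e) { return "network"; }
    catch (Exception e)        { return "other"; }
}

void main()
{
    foreach (kind; 0 .. 4)
        writeln(classify(kind));   // ok, timeout, network, other
}
```

A handler for NetworkException also catches TimeoutException, which is
exactly what you lose if exceptions stop being classes.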

The code that really cares is code that's having to throw exceptions more 
frequently and/or in situations where those exceptions really need to be fast. 
Personally, I've only run into that in unit tests, and I don't think that the
GC has much to do with that slowness (and Adam Ruppe's current work on that
seems to suggest that it's not the GC that's the issue). So, if the stack traces (or
whatever it is that's making them slow) can be fixed to be faster, then as far 
as unit tests go, I would consider the matter taken care of and the fact that 
the GC is used for exceptions to be a non-issue.

The place where this becomes an issue then is code that needs exceptions to be 
really fast (e.g. it sounds like vibe.d falls in that camp). And in that case, 
it doesn't really matter whether the exceptions are allocated on the GC heap 
or malloc's heap. If memory allocation is slowing them down, then they need to 
get rid of the memory allocation entirely, in which case, doing something like 
having a pool of pre-allocated exception objects to reuse would make a lot 
more sense. And in that case, it would probably be better if they weren't on 
the GC heap, but the exception-throwing code wouldn't really care either way. 
That would be up to the pool. The same goes if only a single, static exception 
were used. It might be marginally better if it weren't on the GC heap, because 
it would avoid being scanned, but in those cases where you want speed, you 
_want_ long lifetimes for the exceptions, not short lifetimes like you're 
suggesting, because you want to reuse the exceptions in order to avoid needing 
to allocate new ones. The only way that short lifetimes would work is if we 
weren't dealing with classes and the exceptions were on the stack, but that 
negates our ability to have an exception hierarchy - which is critical to how 
exceptions work.
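
As a rough sketch of the single, reused exception idea (not a recommendation
of any particular design), something like the following would avoid
allocating on every throw. ParseException, cachedEx, requireDigit and
countBadChars are hypothetical names made up for the example. Note the
caveat that a reused object may carry stale state, such as an old stack
trace, from an earlier throw.

```d
import std.stdio : writeln;

// Hypothetical exception type, used only for illustration.
class ParseException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

// Module-level variables are thread-local in D, so each thread
// gets its own instance, allocated once and then reused.
ParseException cachedEx;

static this()
{
    cachedEx = new ParseException("not a digit");
}

void requireDigit(char c)
{
    if (c < '0' || c > '9')
        throw cachedEx;   // no allocation at the throw site
}

size_t countBadChars(string s)
{
    size_t bad;
    foreach (char c; s)
    {
        try
        {
            requireDigit(c);
        }
        catch (Exception e)   // still caught through the base class
        {
            assert(e is cachedEx);   // the same object every time
            ++bad;
        }
    }
    return bad;
}

void main()
{
    writeln(countBadChars("a1b2"));   // prints 2
}
```

The throwing code does not care where cachedEx lives. Whether it sits on the
GC heap or somewhere else is a decision for whatever owns the object.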

And if some code is getting exceptions frequently enough that the memory 
allocation is the bottleneck, then maybe exceptions aren't the best choice 
either. I agree that exceptions need to be much, much faster than they are, 
but they're still intended for the error case, which should be relatively 
infrequent.

> 2. Stack trace is constructed on throw. User pays no matter if the trace
> is needed or not. This is in the works, thankfully.

Yes, which should be a significant improvement and likely a much larger gain 
than any memory allocation issues.

> 3. Turns out message is expected to be a string, formatted a priori:
> https://github.com/D-Programming-Language/druntime/blob/master/src/object_.d#L1306
> Formatting a string in such setting inevitably allocates and it
> happens at the throw site, even if nobody is using that message down the
> line. At least one can override toString...

Ideally, creating the string that toString returns would be put off until 
toString is called (particularly since that includes the stack trace), but I 
would hope that creating the message string to pass to the exception's 
constructor would be cheap enough (particularly in light of the fact that the 
exception is heap-allocated anyway) that it wouldn't be a big deal. So, if we 
can find a way to make this more efficient without getting messy, that's 
great, but I wouldn't expect that to be a bottleneck just so long as the 
actual string that the message gets put into for toString to return (which 
then includes the file and line and stacktrace and whatnot) isn't created 
until toString is called.
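
For what it's worth, here is a minimal sketch of deferring the formatting,
assuming the exception can simply carry the raw values. OutOfRangeException,
checkRange and details are hypothetical names for this example only, and
details() is a stand-in for whatever lazily built text toString would
return. The constructor is handed a string literal, so nothing is formatted
at the throw site.

```d
import std.conv : text;
import std.stdio : writeln;

// Hypothetical exception type, used only for illustration.
class OutOfRangeException : Exception
{
    long value, lo, hi;

    this(long value, long lo, long hi,
         string file = __FILE__, size_t line = __LINE__)
    {
        // A string literal: no formatting at the throw site.
        super("value out of range", file, line);
        this.value = value;
        this.lo = lo;
        this.hi = hi;
    }

    // Hypothetical helper: the detailed text is built only on request.
    string details() const
    {
        return text(value, " is outside [", lo, ", ", hi, "]");
    }
}

void checkRange(long value, long lo, long hi)
{
    if (value < lo || value > hi)
        throw new OutOfRangeException(value, lo, hi);
}

void main()
{
    try
    {
        checkRange(42, 0, 10);
    }
    catch (OutOfRangeException e)
    {
        writeln(e.msg);        // value out of range
        writeln(e.details());  // 42 is outside [0, 10]
    }
}
```

A handler that never asks for the details never pays for the formatting.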

- Jonathan M Davis