Null references redux

Tue Sep 29 12:01:56 PDT 2009

Sean Kelly wrote:
> == Quote from Jeremie Pelletier (jeremiep at gmail.com)'s article
>> Andrei Alexandrescu wrote:
>>> Jeremie Pelletier wrote:
>>>>> Is this Linux specific? what about other *nix systems, like BSD and
>>>>> solaris?
>>>> Signal handler are standard to most *nix platforms since they're part
>>>> of the posix C standard libraries, maybe some platforms will require a
>>>> special handling but nothing impossible to do.
>>> Let me write a message on behalf of Sean Kelly. He wrote that to Walter
>>> and myself this morning, then I suggested him to post it but probably he
>>> is off email for a short while. Hopefully the community will find a
>>> solution to the issue he's raising. Let me post this:
>>>
>>> ===================
>>> Sean Kelly wrote:
>>>
>>> There's one minor problem with his code.  It's not safe to throw an
>>> exception from a signal handler.  Here's a quote from the POSIX spec at
>>> opengroup.org:
>>>
>>> "In order to prevent errors arising from interrupting non-reentrant
>>> function calls, applications should protect calls to these functions
>>> either by blocking the appropriate signals or through the use of some
>>> programmatic semaphore (see semget() , sem_init() , sem_open() , and so
>>> on). Note in particular that even the "safe" functions may modify errno;
>>> the signal-catching function, if not executing as an independent thread,
>>> may want to save and restore its value. Naturally, the same principles
>>> apply to the reentrancy of application routines and asynchronous data
>>> access. Note that longjmp() and siglongjmp() are not in the list of
>>> reentrant functions. This is because the code executing after longjmp()
>>> and siglongjmp() can call any unsafe functions with the same danger as
>>> calling those unsafe functions directly from the signal handler.
>>> Applications that use longjmp() and siglongjmp() from within signal
>>> handlers require rigorous protection in order to be portable."
>>>
>>> If this were an acceptable approach it would have been in druntime ages
>>> ago :-)
>>> ===================
>> Yes but the segfault signal handler is not made to design code that can
>> live with these exceptions, it's just a feature to allow segfaults to be
>> sent to the crash handler to get a backtrace dump. Even on Windows, while
>> you can recover from access violations, it's generally a bad idea to
>> allow for bugs to be turned into features.
> 
> I don't think it's fair to compare Windows to Unix here because, as far as
> I know, Windows (ie. Win32, etc) was built with exceptions in mind (thanks to
> SEH), while Unix was not.  So while the Windows kernel may theoretically be fine
> with an exception being thrown from within kernel code, this isn't true of Unix.
> 
> It's true that as long as only Errors are thrown (and thus that the app intends
> to terminate), things aren't as bad as they could be.  Worst case, some mutex
> in libc is left locked or in some weird state and code executed during stack
> unwinding or when trying to report the error causes the app to hang instead
> of terminate.  And this risk is somewhat mitigated because I'd expect most
> of these errors to occur within user code anyway.
> 
> One thing I'm not entirely sure about is whether the signal handler will always
> have a valid, C-style call stack tracing back into user code.  These errors are
> triggered by hardware, and I really don't know what kind of tricks are common
> at that level of OS code.  longjmp() doesn't have this problem because it doesn't
> care about the call stack--it just swaps some registers and executes a JMP.  I
> don't suppose anyone here knows more about the feasibility of throwing
> exceptions from signal handlers at all?  I'll ask around some OS groups and
> see what people say.

I haven't had any problems so far; the stack trace generated was always
valid and similar to what gdb would output. But I agree that trying to
recover from these exceptions is a *bad* idea in so many ways.

As far as I know, the kernel sets up the stack frame of the signal
handler to make it look as if we called the handler ourselves. Returning
from the signal handler therefore jumps back to the routine in which the
signal was originally raised, without the kernel being aware of it.
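
As a rough illustration of that return path, here is a minimal C sketch,
assuming Linux and a fault on a page we deliberately mapped with
PROT_NONE. The names on_segv, fault_and_resume, guard_page and
fault_count are made up for the example. Note also that on Linux the
return goes through a small trampoline that issues the sigreturn system
call, and that mprotect() is not on the POSIX async-signal-safe list,
so calling it from the handler is an assumption that happens to hold on
Linux rather than something the standard promises.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical globals for the example: the page we expect to fault on. */
static void *guard_page;
static size_t guard_size;
static volatile sig_atomic_t fault_count;

/* SIGSEGV handler: if the fault is inside our guard page, make the page
 * writable and return. Returning resumes the interrupted code at the
 * faulting instruction, which is then retried and succeeds. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    char *addr = (char *)info->si_addr;
    char *base = (char *)guard_page;

    if (addr >= base && addr < base + guard_size) {
        fault_count++;
        /* Assumption: mprotect() is usable here (true on Linux, not
         * guaranteed by POSIX). */
        mprotect(guard_page, guard_size, PROT_READ | PROT_WRITE);
        return;
    }
    /* Any other fault is a real bug: bail out without unwinding. */
    _exit(128 + SIGSEGV);
}

/* Returns 1 if exactly one fault was taken and the store was retried
 * successfully, 0 if not, -1 on setup failure. */
int fault_and_resume(void)
{
    struct sigaction sa, old;

    guard_size = (size_t)sysconf(_SC_PAGESIZE);
    guard_page = mmap(NULL, guard_size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guard_page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(guard_page, guard_size);
        return -1;
    }

    fault_count = 0;
    volatile int *p = (volatile int *)guard_page;
    *p = 42;            /* faults once; the handler unprotects the page */
    int value = *p;     /* no fault this time */

    sigaction(SIGSEGV, &old, NULL);   /* restore the previous handler */
    munmap(guard_page, guard_size);
    return (fault_count == 1 && value == 42) ? 1 : 0;
}
```

Calling fault_and_resume() from main() should return 1 on a typical
Linux box. The point is only that a plain return from the handler goes
back to the instruction that faulted; it says nothing about whether
unwinding through that frame with an exception is safe.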

This is a bit different from how SEH is handled, but has a lot in common
with it:

According to the research I did about SEH internals, it's just built on
top of interrupt handlers. The hardware raises an exception (access
violation, etc.) and jumps into a kernel handler for the corresponding
interrupt. That handler looks at the base of the stack for a pointer to
a struct containing a handler function and a handler table, which is set
and restored by try blocks, and calls the exception handler
(_d_framehandler in our case) with the appropriate parameters. The
kernel then decides what to do based on the return code of the
framehandler.

The signal handler model is therefore quite acceptable to build
exception handling on top of. We may just want to also generate a core
dump manually before throwing the exception, to support postmortem
debugging.
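
One possible way to get that core dump without killing the process is
to fork and let the child abort. This is only a sketch under
assumptions: dump_core_snapshot is a made-up name, whether a core file
is actually written depends on the core size limit (ulimit -c) and the
system's core pattern, and the child only contains the calling thread,
so other threads' stacks would not be in the dump.

```c
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a copy of the process and abort() in the child so the kernel can
 * write a core file of that copy; the parent keeps running.
 * Returns 1 if the child was terminated by SIGABRT, 0 if it ended some
 * other way, -1 if fork() or waitpid() failed. */
int dump_core_snapshot(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: make sure no inherited SIGABRT handler runs first. */
        signal(SIGABRT, SIG_DFL);
        abort();
    }

    /* Parent: reap the child, retrying if a signal interrupts the wait. */
    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);
    if (r != pid)
        return -1;

    return (WIFSIGNALED(status) && WTERMSIG(status) == SIGABRT) ? 1 : 0;
}
```

The handler (or the code that is about to throw) would call
dump_core_snapshot() first and then go on to raise the exception as
before.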