Odd behavior in fiber unittest - printf stops optimization crash

Kai Nacke via digitalmars-d-ldc digitalmars-d-ldc at puremagic.com
Sun Mar 1 05:35:52 PST 2015


On Sunday, 1 March 2015 at 00:30:03 UTC, Kevin Brogan wrote:
> msvc 64 bit build.
>
> Trying to figure out why the unit test for core.thread is 
> failing for release but not for debug.
>
> First I wanted to figure out which tests were causing the 
> crash. Turns out that only the fiber related tests are 
> crashing. All thread related unit tests are fine.
>
> Thoughts are that it would be an optimization problem, since 
> release is crashing but not debug.
>
> I switch out the -O3 flag in build scripts for core.thread with 
> -02. No more crash.
>
> So that's confirmed...  but what line of code is causing the 
> crash? Switch back to -O3 and disable all of the fibre unit 
> tests except for what looked to be the simplest one.
>
> unittest
> {
>     int x = 0;
>
>     (new Fiber({
>         x++;
>     })).call();
>     assert( x == 1 );
> }
>
> Using printf statements I determine that the crash is happening 
> inside of a naked assembly function called fiber_switchContext.
>
> There's almost zero comments so I don't understand exactly why 
> it's doing what it's doing. Seems to be saving the win64 abi 
> non-volatile registers as well as something to do with thread 
> local storage using the GS register which is completely 
> undocumented on msdn from what I can tell.
>
> I have questions about the function. Why is RBX pushed after 
> the XMM registers. (xmm alignment i guess?) Why are RDI and RSI 
> not saved, as they are non-volatile as well according to 
> https://msdn.microsoft.com/en-us/library/9z1stfyw.aspx.
>
> So while I'm writing this I realize I haven't tested what 
> happens if I modify the context switch to save RDI and RSI as 
> well...  unit test is working now!!! next!
>
> Earlier I had figured out that placing a printf on either side 
> of the call to fiber_switchContext would eliminate the crash. I 
> guess it changed the optimized code...
>
> On to the next failing fiber related unit test!

Hi Kevin!

fiber_switchContext has to parameters: the stack pointer of the 
old fiber (oldp) and the stack pointer of the new fiber (newp).
In essence, the state of the old fiber is pushed on the stack 
pointed to by oldp. Then the state of the new fiber is loaded 
from the stack pointed to by newp.
The context switch is finished with a JMP to the new fiber code.

The GS: prefix points to some Windows internal structure. The 
first element is the Thread Information Block:
Definition from <winnt.h>:

typedef struct _NT_TIB {
     struct _EXCEPTION_REGISTRATION_RECORD *ExceptionList;
     PVOID StackBase;
     PVOID StackLimit;
     PVOID SubSystemTib;
....
}

The saved elements are therefore the exception chain, the stack 
base and the stack limit.

The order of the XMM register is simply dictated by alignment as 
you guessed.

Please note the comment from start of function: if you change the 
stack layout then you also have to change function 
Fiber.initStack because here the initial layout is created.

A context switch with fiber_switchContext looks like a normal 
function call to the compiler. The nonvolatile registers must be 
preserved. That RDI/RSI are not saved is a bug. Compare with the 
function from druntime 2.067: the bug is fixed there.

Regards,
Kai


More information about the digitalmars-d-ldc mailing list