[phobos] std.parallelism's unit tests randomly hang on win32

David Simcha dsimcha at gmail.com
Wed May 4 06:51:28 PDT 2011


I went a slightly different route and tried to reduce the problem to as
small a test case as possible, like I would normally do for a compiler bug.
So far I've managed to reduce it to ~560 lines.  I've discovered this one's
more unstable (i.e. the results change a lot more in response to slight
perturbations) than I thought.  Just changing the layout of the Task struct
(deleting member variables that are no longer used anywhere) makes it go
from unit test failures to access violations. Adding or removing try/catch
blocks or empty destructors in some places can completely prevent the bug
from manifesting.  On Linux, if I perturb things slightly by changing the
layout of Task, I get exceptions thrown from core.sync.

This looks like some kind of memory/stack corruption bug, but due to its
nondeterminism (only a few thread interleavings seem to take the proper
code path, and I'm not sure which ones) and its very indirect manifestation
(memory corruption; the low-order bit overwriting thing was, I think, just a
symptom of a deeper problem), I am somewhat at a loss for how to debug
it.  I've scrutinized the concurrency-related aspects and still can't find
any bugs there.  However, I can't prove it's not a concurrency bug, since
running in single-threaded mode prevents certain code paths from being
taken.  Unless I get some advice that changes things, I think my next move
is to compare the disassemblies for cases that work to those for cases that
don't.

On Tue, May 3, 2011 at 1:00 PM, Walter Bright <walter at digitalmars.com> wrote:

> On 5/3/2011 5:43 AM, David Simcha wrote:
>
>>> Add asserts on that pointer value going out of range, and keep working
>>> backwards until the point where the value goes wrong is discovered.
>>
>> Been trying to do that, but I think there are multiple places where this
>> is happening and the asserts are affecting codegen or timings just enough to
>> prevent some.
>
> You can also do the simple:
>
>    if (ptr == bad value) *((char*)0)=0;
>
> which doesn't perturb timings or code gen much. I use these often. The
> debugger will tell you which one tripped.
>
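
For reference, the null-write trap from the quoted reply could be wrapped in a
small helper like the sketch below. This is only an illustration in C, not code
from std.parallelism: BAD_VALUE, is_bad, trap and check_ptr are hypothetical
names, and BAD_VALUE stands in for whatever corrupted value shows up in the
debugger. The store goes through a volatile pointer on the assumption that the
optimizer might otherwise drop or rewrite a plain null write.

```c
#include <stdint.h>

/* Hypothetical placeholder: substitute the corrupted pointer value
   actually observed in the debugger. */
#define BAD_VALUE ((uintptr_t)0x1)

/* Fault on purpose by writing through a null pointer, so the debugger
   stops at the call site. volatile asks the compiler to keep the store. */
static void trap(void)
{
    *(volatile char *)0 = 0;
}

/* Nonzero when ptr holds the value being hunted for. */
static int is_bad(const void *ptr, uintptr_t bad)
{
    return (uintptr_t)ptr == bad;
}

/* Cheap check to sprinkle along the suspect code path: one compare and
   a branch, which should perturb timing and codegen very little. */
static void check_ptr(const void *ptr, uintptr_t bad)
{
    if (is_bad(ptr, bad))
        trap();
}
```

Calling check_ptr(p, BAD_VALUE) at several points and seeing which call faults
first narrows down where the value goes wrong.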