I went a slightly different route and tried to reduce the problem to as small a test case as possible, as I would normally do for a compiler bug. So far I've managed to reduce it to ~560 lines. I've discovered this bug is more unstable than I thought, i.e. the results change a lot more in response to slight perturbations. Just changing the layout of the Task struct (deleting member variables that are no longer used anywhere) turns the unit test failures into access violations. Adding or removing try/catch blocks or empty destructors in some places can completely prevent the bug from manifesting. On Linux, if I perturb things slightly by changing the layout of Task, I get exceptions thrown from core.sync.<br>
<br>This looks like some kind of memory/stack corruption bug, but I am somewhat at a loss for how to debug it, for two reasons. It is nondeterministic: only a few thread interleavings seem to take the code path in question, and I'm not sure which ones. Its manifestation is also very indirect: the overwriting of the low-order bit was, I think, just a symptom of a deeper problem. I've scrutinized the concurrency-related aspects and still can't find any bugs there. However, I can't prove it's not a concurrency bug, since running in single-threaded mode prevents certain code paths from being taken. Unless I get some advice that changes things, I think my next move is to compare the disassemblies of the cases that work with those of the cases that don't.<br>
<br><div class="gmail_quote">On Tue, May 3, 2011 at 1:00 PM, Walter Bright <span dir="ltr">&lt;<a href="mailto:walter@digitalmars.com">walter@digitalmars.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div class="im"><br>
<br>
On 5/3/2011 5:43 AM, David Simcha wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Add asserts on that pointer value going out of range, and keep working backwards until the point where the value goes wrong is discovered.<br>
</blockquote>
<br>
<br>
Been trying to do that, but I think there are multiple places where this is happening and the asserts are affecting codegen or timings just enough to prevent some.<br>
</blockquote>
<br></div>
You can also do the simple:<br>
<br>
if (ptr == bad value) *((char*)0)=0;<br>
<br>
which doesn't perturb timings or code gen much. I use these often. The debugger will tell you which one tripped.<div><div></div><div class="h5"><br>
_______________________________________________<br>
phobos mailing list<br>
<a href="mailto:phobos@puremagic.com" target="_blank">phobos@puremagic.com</a><br>
<a href="http://lists.puremagic.com/mailman/listinfo/phobos" target="_blank">http://lists.puremagic.com/mailman/listinfo/phobos</a><br>
</div></div></blockquote></div><br>
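<br>For reference, here is a minimal C sketch of the null-write tripwire suggested in the quoted message, applied to the low-order-bit symptom. Everything named here is hypothetical: the TRAP_IF macro, the low_bit_set check and the two-field Task stand-in are illustrations, not the real code. The volatile qualifier is my addition and is not part of the quoted one-liner; it is there on the assumption that an optimizing compiler might otherwise drop the store.<br>

```c
#include <stddef.h>
#include <stdint.h>

/* Crash on the spot by storing through a null pointer, so the debugger
   stops at the exact check that fired. volatile is an added assumption:
   it makes it less likely that the optimizer removes the store. */
#define TRAP_IF(cond) \
    do { if (cond) *(volatile char *)0 = 0; } while (0)

/* Hypothetical corruption test: a pointer to a word-aligned object
   should never have its low-order bit set. */
int low_bit_set(const void *p)
{
    return (int)((uintptr_t)p & 1u);
}

/* Hypothetical stand-in for the real Task struct. */
struct Task {
    struct Task *next;
    int id;
};

/* Walk a linked list of tasks, checking every pointer before it is
   dereferenced. A corrupted pointer faults here instead of somewhere
   far downstream. */
int count_tasks(const struct Task *head)
{
    int n = 0;
    for (const struct Task *t = head; t != NULL; t = t->next) {
        TRAP_IF(low_bit_set(t));
        n++;
    }
    return n;
}
```

On an intact list the checks are a single compare and branch each, so they should disturb codegen and timing less than a full assert would. Placing one at each spot where the pointer is read or written narrows down where the value first goes bad.<br>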