Floating Point + Threads?

dsimcha dsimcha at yahoo.com
Fri Apr 15 20:22:04 PDT 2011


I'm trying to debug an extremely strange bug whose symptoms appear in a 
std.parallelism example, though I'm not at all sure the root cause is in 
std.parallelism.  The bug report is at 
https://github.com/dsimcha/std.parallelism/issues/1#issuecomment-1011717 .

Basically, the example in question sums up all the elements of a lazy 
range (actually, std.algorithm.map) in parallel.  It uses 
taskPool.reduce, which divides the summation into work units to be 
executed in parallel.  When executed in parallel, the results of the 
summation are non-deterministic after about the 12th decimal place, even 
though all of the following properties are true:

1.  The work is divided into work units in a deterministic fashion.

2.  Within each work unit, the summation happens in a deterministic order.

3.  The final summation of the results of all the work units is done in 
a deterministic order.

4.  The smallest term in the summation is about 5e-10.  This means the 
difference across runs is about two orders of magnitude smaller than the 
smallest term.  It can't be a concurrency bug where some terms sometimes 
get skipped.

5.  The results for the individual tasks, not just the final summation, 
differ in the low-order bits.  Each task is executed in a single thread.

6.  The rounding mode is apparently the same in all of the threads.

7.  The bug appears even on machines with only one core, as long as the 
number of task pool threads is manually set to >0.  Since it's a single 
core machine, it can't be a low level memory model issue.

What could possibly cause such small, non-deterministic differences in 
floating point results, given everything above?  I'm just looking for 
suggestions here, as I don't even know where to start hunting for a bug 
like this.


More information about the Digitalmars-d mailing list