Advice on threading/fibers?

Jonathan M Davis jmdavisProg at gmx.com
Wed Jun 15 18:44:34 PDT 2011


On 2011-06-15 16:57, Justin Whear wrote:
> Consider the following:
> 
> You have 10 million data points and you need to apply a multipass algorithm
> to them. Each pass is like a cellular automaton: it can read the values
> from the previous pass but not the in-progress "current" values. This makes
> the processing of each value trivially parallelizable. The operation on
> each value is fairly simple and cheap (essentially a multidimensional
> ancestor-child averaging operation).
> 
> After each value has been operated on once, the pass is complete and the
> "current" and "old" buffers are switched (conceptually, the "current"
> buffer is write-only and the "old" buffer is read-only; I'm using
> __gshared here).
> 
> The number of passes is not fixed; in the course of each value operation,
> an error is computed. When the worst individual error falls below a
> certain threshold, the algorithm is finished. Generally this will take
> between one thousand and ten thousand passes.
> 
> How would you go about parallelizing this? My thought is to take the
> map/reduce approach within each pass: each thread/fiber takes a slice of
> the dataset, makes its modifications, then returns an error summary. These
> summaries are quickly combined and the algorithm loop decides whether to
> run again. Each pass shouldn't take more than a second or two, so I'm not
> sure whether introducing the overhead of spawning, say, 10 threads each
> pass is worthwhile (times 5000 passes). On the other hand, I have plenty
> of CPUs to throw at it (at least 16 cores, each with hyperthreading) and
> am in a situation where "as fast as possible" is important (while
> individual datasets may not grow, the number of them is growing).
> 
> Any thoughts appreciated.

You should take a look at std.parallelism.
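
Its taskPool spawns its worker threads once and reuses them for every
pass, so you don't pay thread-creation overhead 5000 times, and its
parallel foreach handles slicing the range across the workers for you.

Here's a rough, untested sketch of what the pass loop might look like.
The 1-D buffers and the stencil function are placeholders, since I don't
know your actual operation or error metric:

import std.algorithm : swap;
import std.math : abs;
import std.parallelism : parallel, taskPool;
import std.range : iota;

// Placeholder standing in for your real multidimensional
// ancestor-child averaging operation.
double stencil(const(double)[] buf, size_t i)
{
    immutable left  = i == 0 ? buf[i] : buf[i - 1];
    immutable right = i + 1 == buf.length ? buf[i] : buf[i + 1];
    return (left + buf[i] + right) / 3.0;
}

void main()
{
    enum size_t n = 10_000_000;
    enum threshold = 1e-6;

    auto old = new double[n];
    auto cur = new double[n];
    auto err = new double[n]; // per-element error, reduced after each pass

    old[] = 0.0; // ... initialize with your real data ...

    double worst;
    do
    {
        // One pass: reads only from old, writes only to cur, so the
        // iterations are independent and safe to run in parallel.
        foreach (i; parallel(iota(n)))
        {
            immutable next = stencil(old, i);
            err[i] = abs(next - old[i]); // placeholder error metric
            cur[i] = next;
        }

        // Combine the per-element errors; this reduce runs in parallel too.
        worst = taskPool.reduce!"a > b ? a : b"(err);

        swap(old, cur); // flip the buffers for the next pass
    }
    while (worst > threshold);
}

Since the per-element work is so cheap, you may want to experiment with
an explicit work-unit size, e.g. parallel(iota(n), 100_000), and
benchmark. Note also that with this approach you don't need __gshared;
the parallel foreach body just captures the local buffers.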

- Jonathan M Davis

