Simple parallel foreach and summation/reduction
Chris Katko
ckatko at gmail.com
Thu Sep 20 05:34:42 UTC 2018
All I want to do is loop from 0 to [constant] with a for or
foreach, and have it split up across however many cores I have.
ulong sum;
foreach (i; [0 to 1 trillion]) // pseudocode range
{
    // flip some dice using a uniform random float
    float die_value = uniform(0F, 12F); // needs import std.random
    if (die_value > [constant]) sum++;
}
writefln("The sum is %d", sum); // writeln doesn't format; writefln does
However, there are two caveats:
- One: I can't throw a range of values into an array and foreach
on that, like many examples do, because 1 trillion elements
(counting from zero) might be a little big for an array. (I'm
using 1 trillion to illustrate a specific bottleneck / problem
form.)
- Two: I want to merge the results at the end.
Which means I either need to use mutexes (BAD. NO. BOO. HISS.),
or each "thread" needs to know which one it is, store its sum
in, say, a thread[#].sum variable, and then, once all have
completed, add those sums together.
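(Writing this out, I went and skimmed the std.parallelism docs,
and taskPool.reduce looks like it might be exactly this pattern:
iota gives a lazy 0..N range, so no trillion-element array is
ever allocated, and reduce keeps a private partial sum per
worker thread and merges them at the end, no mutexes. A sketch
of what I mean -- the 6F threshold is just a stand-in for
[constant], and I shrank N so it finishes quickly:)

```d
import std.algorithm : map;
import std.parallelism : taskPool;
import std.random : uniform;
import std.range : iota;
import std.stdio : writefln;

void main()
{
    enum ulong n = 100_000_000; // stand-in for the full trillion

    // Each worker reduces its own chunk into a private partial
    // sum; taskPool.reduce then adds the per-worker sums once.
    ulong sum = taskPool.reduce!"a + b"(
        iota(n).map!(i => uniform(0F, 12F) > 6F ? 1UL : 0UL));

    writefln("The sum is %d", sum);
}
```

(If I'm reading the docs right, reduce wants a random-access
range with length, which map over iota is, and each thread gets
its own thread-local rndGen, so calling uniform from the workers
should be safe.)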
I know this is an incredibly simple conceptual problem to solve.
So I feel like I'm missing some huge, obvious, answer for doing
it elegantly in D.
And this just occurred to me: if I have a trillion foreach
iterations, will that spawn a trillion threads? What I want is,
IIRC, what OpenMP does. It divides your range (blocks of
sequential numbers) by the number of threads, so a domain of
[1 to 1000] with ten threads becomes workloads on the index
ranges [1-100], [101-200], [201-300], and so on, one per CPU.
Each gets a 100-element chunk.
So I guess foreach won't work here for that, will it? Hmmm...
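(Skimming further: std.parallelism's parallel foreach seems to
do exactly that OpenMP-style blocking -- you hand it a
workUnitSize and it doles out sequential chunks of the index
range to taskPool's worker threads, so a trillion iterations
never means a trillion threads. Combined with
taskPool.workerLocalStorage for the per-thread "thread[#].sum"
idea, something like this -- the chunk size of 10_000 is a
guess, not tuned:)

```d
import std.parallelism : parallel, taskPool;
import std.random : uniform;
import std.range : iota;
import std.stdio : writefln;

void main()
{
    enum ulong n = 1_000_000;

    // One private accumulator per worker thread -- the
    // "thread[#].sum" idea, without any locking.
    auto partial = taskPool.workerLocalStorage!ulong(0);

    // parallel() hands each worker sequential blocks of
    // workUnitSize indexes from the lazy iota range.
    foreach (i; parallel(iota(n), 10_000)) // workUnitSize
    {
        if (uniform(0F, 12F) > 6F)
            partial.get++;
    }

    // Merge the per-worker sums once all the work is done.
    ulong sum = 0;
    foreach (p; partial.toRange)
        sum += p;

    writefln("The sum is %d", sum);
}
```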
----> But again, conceptually this is simple: I have, say, 1
trillion sequential numbers. I want to assign a "block" (or
"range") to each CPU core. And since their math does not actually
interfere with each other, I can simply sum each core's results at
the end.
Thanks,
--Chris
More information about the Digitalmars-d-learn
mailing list