Good demo for showing benefits of parallelism
Jascha Wetzel
"[firstname]" at mainia.de
Fri Jan 26 17:27:01 PST 2007
i had a simple, unoptimized raytracer from an old CG assignment lying
around that i ported to D and modified for parallel computation.
download here: http://mainia.de/prt.zip
on a single core/cpu machine, the time stays about the same until 4
threads, then the overhead kicks in.
here are some values from an opteron dual core system running linux
kernel 2.6 for different thread counts:
threads  seconds
      1   32.123
      2   32.182
      3   29.329
      4   28.556
      8   21.661
     16   20.186
     24   20.423
     32   21.410
these aren't quite what i expected. CPU usage shows that both cores get
about 55-80% load with 2 threads, stabilizing at 65-75% with 16
threads. with a single thread it's clearly 100% on one core.
am i missing something about the pthread lib or memory usage/sync that
prevents reasonable speedup? a speedup to about 160% (32.1s down to
20.2s) is nice, but why does it take 16 threads to get there? and where
exactly do the remaining 40% of the ideal 200% on two cores go?
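for reference, the speedup figures follow directly from the table above. here is a small python sketch of the arithmetic (python only because it's quick to run, it has nothing to do with the D code). the names TIMINGS, CORES, speedups and best are made up for this illustration, and CORES = 2 assumes the dual core box the numbers came from:

```python
# wall-clock seconds per thread count, copied from the table above
TIMINGS = {
    1: 32.123,
    2: 32.182,
    3: 29.329,
    4: 28.556,
    8: 21.661,
    16: 20.186,
    24: 20.423,
    32: 21.410,
}

CORES = 2  # assumption: the dual core opteron the timings were taken on


def speedups(timings):
    """Speedup of each run relative to the single-thread run."""
    base = timings[1]
    return {n: base / t for n, t in timings.items()}


def best(timings):
    """Thread count with the lowest wall-clock time."""
    return min(timings, key=timings.get)


if __name__ == "__main__":
    print("threads  seconds  speedup  of ideal")
    for n, s in sorted(speedups(TIMINGS).items()):
        # "of ideal" is the speedup as a fraction of a perfect CORES-fold gain
        print(f"{n:>7} {TIMINGS[n]:>8.3f} {s:>8.2f} {s / CORES:>9.0%}")
```

this gives a best case of about 1.59x at 16 threads, i.e. roughly 80% of what two cores could ideally deliver, while the run with 2 threads comes out marginally slower than the run with 1.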
Bill Baxter wrote:
> I don't remember where it was, but somebody in some thread said
> something like gee it would be great if we had a demo that showed the
> benefits of having parallel threads. (It was probably in either the
> Futures lib thread or in the discussion about varargs_reduce).
>
> Anyway, a raytracer is a perfect example of "embarrassingly
> parallelizable" code. In the simplest case you can state it simply as
> "for each pixel do trace_ray(ray_dir)".
>
> Bradley Smith posted code for a little raytracer over on D.learn under
> the heading "Why is this code slower than C++". If anyone is interested
> in turning this into a parallel raytracer that would be a nice little demo.
>
> Along the same lines, almost any image processing algorithm is in the
> same category where the top loop looks like "foreach pixel do ___". This
> is a big part of why GPUs just keep getting faster and faster, because
> the basic problem is just so inherently parallelizable.
>
> --bb
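to make the "for each pixel do trace_ray(ray_dir)" shape concrete, here is a minimal python sketch of splitting an image into rows and handing the rows to a pool of worker threads. it is not the code from prt.zip. trace_ray here is a hypothetical stand-in that returns a dummy grey value, and WIDTH, HEIGHT, render_row and render are names invented for the example. it is only meant to show the decomposition, not to reproduce the timings:

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 8, 6  # tiny image, just for illustration


def trace_ray(x, y):
    # hypothetical stand-in for a real ray tracer: returns a dummy grey value
    return (x + y) % 256


def render_row(y):
    # each row depends only on its own pixels, so rows can run independently
    return [trace_ray(x, y) for x in range(WIDTH)]


def render(num_threads):
    # one task per row; pool.map returns the rows in their original order
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(render_row, range(HEIGHT)))


if __name__ == "__main__":
    image = render(num_threads=4)
    print(len(image), "rows of", len(image[0]), "pixels")
```

since no pixel depends on any other, the result is the same for any thread count. only the way the rows are distributed over the workers changes.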