std.parallelism: Request for Review

dsimcha dsimcha at yahoo.com
Sat Feb 26 14:40:39 PST 2011


I have no idea why the euclidean benchmark shows a superlinear speedup 
without -release (roughly 11x on 4 cores, going by your numbers), though 
I'm able to reproduce this on my box.  My guess is that it has something 
to do with std.algorithm's use of asserts, which -release compiles out.
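
For anyone who wants to poke at this, here is a minimal sketch of the 
kind of serial vs. parallel reduce comparison involved.  It is not the 
actual euclidean benchmark: serialNorm, parallelNorm and the input data 
are made up for illustration, and the timings will vary by machine.

```d
import std.algorithm : map, reduce;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.math : sqrt;
import std.parallelism : taskPool;
import std.stdio : writefln;

// Hypothetical helpers; not the benchmark's actual code.
double serialNorm(double[] v)
{
    // Plain std.algorithm reduce on the calling thread.
    return sqrt(reduce!"a + b"(0.0, map!"a * a"(v)));
}

double parallelNorm(double[] v)
{
    // taskPool.reduce splits the range into work units for the worker threads.
    return sqrt(taskPool.reduce!"a + b"(0.0, map!"a * a"(v)));
}

void main()
{
    // Small integer-valued inputs, so both summation orders give the same result.
    auto v = new double[4_000_000];
    foreach (i, ref x; v)
        x = cast(double)(i % 7) - 3.0;

    auto sw = StopWatch(AutoStart.yes);
    immutable s = serialNorm(v);
    immutable serialMs = sw.peek.total!"msecs";

    sw.reset();
    immutable p = parallelNorm(v);
    immutable parallelMs = sw.peek.total!"msecs";

    assert(s == p);
    writefln("Serial reduce:  %s milliseconds.", serialMs);
    writefln("Parallel reduce:  %s milliseconds.", parallelMs);
}
```

Building this once with -release and once without should show whether 
the gap between the two builds comes from the serial side, which is 
what the assert theory would predict.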

As far as operating systems go, I'm glad you tested on XP32.  One thing 
that can make a **huge** difference is that, on XP, synchronized blocks 
immediately hit kernel calls and context switches unless you use the 
Windows API directly to explicitly override this behavior.  On Vista and 
7, the default behavior (which D uses) is to spin for a short period of 
time before context switching when waiting on a lock.  This is usually 
vastly more efficient in the case of heavily contended, fine-grained 
locking.  I tested on Windows 7, and I'm very happy that none of the 
numbers completely blew up on XP because of this issue.
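
To make the spin-before-switching idea concrete, here is a rough sketch 
of a lock that spins briefly and only then gives up its time slice.  It 
is an illustration only, not what druntime or Windows actually do: the 
names (acquire, release, runWorkers) and the spinLimit value are 
assumptions, and it falls back to Thread.yield() where a real OS lock 
would block on a kernel object.

```d
import core.atomic : atomicStore, cas;
import core.thread : Thread;

shared bool locked;      // hypothetical lock flag
__gshared int counter;   // data guarded by the lock
enum spinLimit = 1_000;  // assumed spin budget, chosen arbitrarily

void acquire()
{
    for (;;)
    {
        // Phase 1: spin in user space, hoping the holder releases soon.
        foreach (_; 0 .. spinLimit)
            if (cas(&locked, false, true))
                return;
        // Phase 2: give up the time slice (a real OS lock would block in the kernel).
        Thread.yield();
    }
}

void release()
{
    atomicStore(locked, false);
}

// Runs nThreads workers that each take the lock `iterations` times.
int runWorkers(size_t nThreads, size_t iterations)
{
    counter = 0;
    auto threads = new Thread[nThreads];
    foreach (ref t; threads)
    {
        t = new Thread({
            foreach (_; 0 .. iterations)
            {
                acquire();
                ++counter;   // very short critical section
                release();
            }
        });
        t.start();
    }
    foreach (t; threads)
        t.join();
    return counter;
}

void main()
{
    assert(runWorkers(4, 10_000) == 40_000);
}
```

With a critical section this short, the lock is usually released well 
within the spin phase, so waiters rarely reach the yield at all.  
Setting spinLimit to 0 approximates the behavior of going straight to a 
context switch on every contended acquire.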

On 2/26/2011 5:30 PM, Andrej Mitrovic wrote:
> Without release, only the euclidean benchmark shows a more dramatic
> speed difference:
> Serial reduce:  6298 milliseconds.
> Parallel reduce with 4 cores:  567 milliseconds.
>
> I forgot to mention I'm on XP32. I could test these on a virtualized
> Linux, if that's worth testing.


