Scalability in std.parallelism

Tue Feb 25 08:06:53 PST 2014

On Tue, 2014-02-25 at 12:33 +0000, thedeemon wrote:
> On Monday, 24 February 2014 at 14:34:14 UTC, Russel Winder wrote:
> 
> > Two cores with hyperthreads generally means a maximum speed up 
> > of 2 with optimized native code.
> 
> Not true. If the code is not trivial and the threads are not 
> doing exactly the same instructions (i.e. they can do some search 
> where the number of operations depends on the data) then 2 cores x 2 
> hyperthreads can easily provide more than 2x speed up (but far 
> from 4x of course). I see it very often in my video processing 
> code.

I suspect the issue here is how compute-intensive the code is: whether
there are cache line misses, requests out to memory, etc., i.e. whether
the code is non-trivial. My observation, a gross over-simplification,
stems from CPU-bound jobs with very localized data, no need for memory
writes, and definitely no I/O. In such jobs a thread rarely stalls
waiting on memory, which leaves no opportunity for the hyperthreads to
contribute anything.

I would guess that your video processing uses a cache-friendly
(streaming?) algorithm, so that one hyperthread can operate on data
already in cache whilst the other gets more data into the cache. This
could easily give a >2x speed up on a machine with 2 cores and 2
hyperthreads per core, provided the data chunks are of a suitable size
and the memory reads and writes are in good rhythm with the calculation
done on the data.
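
One way to check which regime a given workload falls into is to time it
at increasing thread counts and compare against a serial run. The
following is only a minimal sketch, not code from this thread: the
kernel, the task count, and the names kernel and timeWith are
hypothetical, and the kernel is deliberately CPU-bound with no array
accesses or I/O. Replacing it with a memory-streaming loop is how one
would probe the other case.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.parallelism : TaskPool, totalCPUs;
import std.stdio : writefln;

// Hypothetical CPU-bound kernel: a dependent chain of floating-point
// operations with no array accesses and no I/O.
double kernel(size_t seed)
{
    double x = seed + 1.0;
    foreach (_; 0 .. 2_000_000)
        x = x * 1.000_000_1 + 0.5;
    return x;
}

// Wall-clock seconds to run `tasks` kernels on `threads` threads.
double timeWith(size_t threads, size_t tasks)
{
    auto results = new double[tasks];
    if (threads <= 1)
    {
        // Serial baseline: plain loop, no thread pool involved.
        auto serial = StopWatch(AutoStart.yes);
        foreach (i, ref r; results)
            r = kernel(i);
        return serial.peek.total!"usecs" / 1e6;
    }
    // The calling thread also executes work in a parallel foreach,
    // so threads - 1 workers gives `threads` threads in total.
    auto pool = new TaskPool(threads - 1);
    scope (exit) pool.finish(true);
    auto sw = StopWatch(AutoStart.yes);
    foreach (i, ref r; pool.parallel(results, 1))
        r = kernel(i);
    return sw.peek.total!"usecs" / 1e6;
}

void main()
{
    enum tasks = 64; // assumed task count, chosen only for illustration
    immutable base = timeWith(1, tasks);
    // totalCPUs counts logical CPUs, so hyperthreads are included.
    writefln("logical CPUs: %s", totalCPUs);
    foreach (n; 1 .. totalCPUs + 1)
        writefln("%2s threads: speed-up %.2f", n, base / timeWith(n, tasks));
}
```

If the speed-up stops improving once the thread count reaches the number
of physical cores, the job is in the CPU-bound regime described above;
if it keeps climbing beyond that, the hyperthreads are finding stalls to
fill. Timings from a single run are noisy, so repeated runs would be
needed before drawing conclusions.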

-- 
Russel.