Scalability in std.parallelism
Russel Winder
russel at winder.org.uk
Tue Feb 25 08:06:53 PST 2014
On Tue, 2014-02-25 at 12:33 +0000, thedeemon wrote:
> On Monday, 24 February 2014 at 14:34:14 UTC, Russel Winder wrote:
>
> > Two cores with hyperthreads generally means a maximum speed up
> > of 2 with optimized native code.
>
> Not true. If the code is not trivial and the threads are not
> doing exactly same instructions (i.e. they can do some search
> where number of operations depends on data) then 2 cores x 2
> hyperthreads can easily provide more than 2x speed up (but far
> from 4x of course). I see it very often in my video processing
> code.

I suspect the issue here is how compute-intensive the code is: whether
there are cache line misses, requests out to memory, and so on, i.e.
whether the code is non-trivial. My observation, a gross
over-simplification, stems from CPU-bound jobs with very localized data,
no need for memory writes, and definitely no I/O. Such jobs leave no
opportunity for the hyperthreads to contribute anything.
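
One way to check this on a given machine is to time the same CPU-bound
work with 1, 2, ... totalCPUs threads. The following is only a sketch:
kernel and chunks are hypothetical stand-ins, not code from this thread,
and it assumes a current D compiler that ships std.datetime.stopwatch.
What it prints depends on the machine and on how well the kernel fills
a core's execution units, so treat the numbers as indicative at best.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.parallelism : TaskPool, totalCPUs;
import std.range : iota;
import std.stdio : writefln;

// Hypothetical stand-in for a CPU-bound job with very localized data:
// everything fits in registers, nothing is read from or written to
// memory inside the loop, and there is no I/O.
double kernel(size_t seed)
{
    double x = seed + 1.0;
    foreach (i; 0 .. 20_000_000)
        x = x * 1.0000001 + 0.5;
    return x;
}

void main()
{
    enum chunks = 64;                 // assumed number of work items
    auto results = new double[chunks];
    double baseline = 0;

    // totalCPUs counts logical CPUs, so hyperthreads are included.
    foreach (threads; 1 .. totalCPUs + 1)
    {
        // TaskPool(n) starts n worker threads; the thread running the
        // parallel foreach also executes work units, hence the "- 1".
        auto pool = new TaskPool(threads - 1);
        scope (exit) pool.finish();

        auto sw = StopWatch(AutoStart.yes);
        foreach (i; pool.parallel(iota(chunks), 1))
            results[i] = kernel(i);
        immutable secs = sw.peek.total!"usecs" / 1e6;

        if (threads == 1)
            baseline = secs;          // single-thread time is the reference
        writefln("%2d threads: %.3f s, speed-up %.2f",
                 threads, secs, baseline / secs);
    }
}
```

Compile with optimization switched on (for example -O -release), since
the claim is about optimized native code.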

I would guess that your video processing uses a cache-friendly
(streaming?) algorithm, so that one hyperthread can operate on data
already in cache whilst the other brings more data into the cache. This
could easily give a >2x speed-up on a machine with 2 cores and 2
hyperthreads per core, provided the data chunks are of a suitable size
and the memory reads and writes are in good rhythm with the calculation
done on the data.
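
The argument can be put as a toy calculation. This is an idealised
model, not a measurement: it assumes a lone thread keeps a core's
execution units busy for a fraction busy of the time and is stalled on
memory for the rest, and that sibling hyperthreads can use only the
cycles left idle. The name modelSpeedup and the parameter busy are
made up for this illustration.

```d
import std.algorithm.comparison : min;
import std.stdio : writefln;

/// Idealised speed-up relative to one thread on one core.
/// busy: assumed fraction (0 < busy <= 1) of the time a lone thread
/// keeps the core's execution units occupied.
double modelSpeedup(uint cores, uint threadsPerCore, double busy)
{
    // A core cannot be more than 100% busy, so per-core throughput
    // is capped at 1 / busy however many hyperthreads share it.
    return cores * min(cast(double) threadsPerCore, 1.0 / busy);
}

void main()
{
    // busy = 1.0 is the pure CPU-bound case; lower values leave
    // stall cycles that a second hyperthread can fill.
    foreach (busy; [1.0, 0.8, 0.6])
        writefln("busy = %.1f: speed-up %.2f",
                 busy, modelSpeedup(2, 2, busy));
}
```

With busy = 1.0 the model gives 2x on 2 cores x 2 hyperthreads, and
with busy = 0.8 it gives 2.5x: more than 2x but well short of 4x.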
--
Russel.
=============================================================================
Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder at ekiga.net
41 Buckmaster Road m: +44 7770 465 077 xmpp: russel at winder.org.uk
London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder