std.parallelism changes done

dsimcha dsimcha at yahoo.com
Thu Mar 24 06:36:51 PDT 2011


On 3/24/2011 3:29 AM, Sönke Ludwig wrote:
> Hm depending on the way the pool is used, it might be a better default
> to have the number of threads equal the number of cpu cores. In my
> experience the control thread is mostly either waiting for tasks or
> processing messages and blocking in between so it rarely uses a full
> core, wasting the available computation time in this case.

It's funny, it seems like the task parallelism stuff is getting much 
more attention from the community than the data parallelism stuff.  I 
hardly ever use the task parallelism and use mostly data parallelism. 
I'm inclined to leave this as-is because:

1.  The current default (one fewer worker thread than there are cores) is
definitely the right answer for data parallelism, and the right answer for
the task parallelism case is much less obvious.

2.  The main thread is utilized in the situation you describe.  As I 
mentioned in a previous post, when a task that has not been started by a 
worker thread yet is forced, it is executed immediately in the thread 
that tried to force it, regardless of its position in the queue.  There 
are two reasons for this:

     a.  It guarantees that there won't be any deadlocks where a task 
waits for another task that's behind it in the queue.

     b.  If you're trying to force a task, then you obviously need the 
results ASAP, so it's an ad-hoc form of prioritization.
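
To make the forcing rule concrete, here is a rough toy model in Python. It is not the std.parallelism implementation: ToyTask, ToyQueue and force are hypothetical names, and there are no worker threads at all, so every task is still unstarted when it is forced. It only shows an unstarted task being pulled out of the queue and run in the forcing thread, even when it sits behind the task that is waiting on it.

```python
import threading
from collections import deque


class ToyTask:
    """A unit of work that records which thread executed it (hypothetical)."""

    def __init__(self, fn):
        self.fn = fn
        self.done = threading.Event()
        self.result = None
        self.ran_in = None  # id of the thread that ran the task

    def run(self):
        self.ran_in = threading.get_ident()
        self.result = self.fn()
        self.done.set()


class ToyQueue:
    """A pending-task queue with no workers attached (an assumption of this sketch)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = deque()

    def put(self, task):
        with self._lock:
            self._pending.append(task)
        return task

    def force(self, task):
        # If nobody has started the task yet, remove it from the queue,
        # wherever it sits, and run it in the calling thread.
        with self._lock:
            try:
                self._pending.remove(task)
                claimed = True
            except ValueError:
                claimed = False
        if claimed:
            task.run()        # the lock is released before running
        else:
            task.done.wait()  # already started elsewhere: just wait
        return task.result


q = ToyQueue()
later = ToyTask(lambda: 21)
earlier = ToyTask(lambda: 2 * q.force(later))  # depends on a task behind it
q.put(earlier)
q.put(later)
answer = q.force(earlier)
```

Forcing `earlier` runs it in the calling thread, which in turn forces `later` out of the queue and runs it there too, so the dependency on a task further back in the queue cannot deadlock.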
>
> However, I'm not really sure if it is like this for the majority of all
> applications or if there are more cases where the control thread will
> continue to do computations in parallel. Maybe we could collect some
> opinions on this?
>
> On another note, I would like to see a rough description on what the
> default workUnitSize is depending on the size of the input. Otherwise it
> feels rather uncomfortable to use this version of parallel().

Hmm, this was there in the old documentation.  Andrei recommended 
against documenting it for one of the cases because it might change.  I 
can tell you that, right now, it's:

1.  The workUnitSize that would create TaskPool.size * 4 work units, if
the range has a length.

2.  512 if the range doesn't have a length.
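
As plain arithmetic, the two rules above could be sketched like this in Python. The function name, the parameter names and the round-up are assumptions made for illustration, the sketch assumes at least one worker thread, and the real default may change as noted.

```python
def default_work_unit_size(pool_size, range_length=None,
                           units_per_thread=4, fallback=512):
    """Toy version of the default work unit size rule (hypothetical helper).

    pool_size stands in for TaskPool.size and must be at least 1 here.
    range_length is None when the range has no length.
    """
    if range_length is None:
        return fallback  # rule 2: no length available
    target_units = pool_size * units_per_thread  # rule 1: size * 4 units
    # Round up so the units cover the whole range, never going below 1.
    return max(1, -(-range_length // target_units))


with_length = default_work_unit_size(3, 1200)    # 12 units of 100 elements
uneven = default_work_unit_size(3, 1000)         # rounded up to 84
tiny = default_work_unit_size(3, 5)              # never below 1
without_length = default_work_unit_size(3)       # falls back to 512
```

So with 3 worker threads and 1200 elements, each work unit would cover 100 elements under these assumptions.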
>
> Another small addition would be to state that the object returned by
> asyncBuf either is an InputRange or which useful methods it might have
> (some kind of progress counter could also be useful here).

I guess this could be a little clearer, but it's really just a plain
vanilla input range that has a length if and only if the source range has
a length.  There are no other public methods.
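
The "length only if the source has one" part can be sketched in Python as below. This is a toy analogue, not how asyncBuf is implemented: async_buf, the two helper classes and the buf_size default are hypothetical, and a single daemon thread stands in for the task pool.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the source


class _Buffered:
    """Iterator fed by a background thread through a bounded queue."""

    def __init__(self, source, buf_size):
        self._queue = queue.Queue(maxsize=buf_size)
        worker = threading.Thread(target=self._fill, args=(iter(source),),
                                  daemon=True)
        worker.start()

    def _fill(self, it):
        for item in it:
            self._queue.put(item)  # blocks while the buffer is full
        self._queue.put(_DONE)

    def __iter__(self):
        return self

    def __next__(self):
        item = self._queue.get()
        if item is _DONE:
            self._queue.put(_DONE)  # keep later calls from blocking
            raise StopIteration
        return item


class _BufferedWithLength(_Buffered):
    """Same iterator, plus a length that shrinks as items are consumed."""

    def __init__(self, source, buf_size):
        self._remaining = len(source)
        super().__init__(source, buf_size)

    def __next__(self):
        item = super().__next__()
        self._remaining -= 1
        return item

    def __len__(self):
        return self._remaining


def async_buf(source, buf_size=100):
    # Expose a length only when the source itself has one.
    cls = _BufferedWithLength if hasattr(source, "__len__") else _Buffered
    return cls(source, buf_size)


sized = async_buf([1, 2, 3])                   # a list has a length
unsized = async_buf(x * x for x in range(3))   # a generator does not
```

Here `sized` supports `len()` while `unsized` does not, and neither object offers anything beyond plain iteration.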


