std.parallelism: Final review

dsimcha dsimcha at yahoo.com
Sat Mar 19 07:45:12 PDT 2011


On 3/19/2011 9:37 AM, Michel Fortin wrote:
> On 2011-03-18 22:27:14 -0400, dsimcha <dsimcha at yahoo.com> said:
>
>> I think your use case is both beyond the scope of std.parallelism and
>> better handled by std.concurrency. std.parallelism is mostly meant to
>> handle the pure multicore parallelism use case. It's not that it
>> **can't** handle other use cases, but that's not what it's tuned for.
>
> I know. But if this makes its way into the standard library, perhaps it
> should aim at reaching a slightly wider audience? Especially since it
> lacks so little to become more general purpose...

Fair enough.  You've convinced me, since I've just recently started 
pushing std.parallelism in this direction in both my research work and 
in some of the examples I've been using, and you've given very good 
specific suggestions about **how** to expand things a little.

>
>
>> As far as prioritization, it wouldn't be hard to implement
>> prioritization of when a task starts (i.e. have a high- and
>> low-priority queue). However, the whole point of TaskPool is to avoid
>> starting a new thread for each task. Threads are recycled for
>> efficiency. This prevents changing the priority of things in the OS
>> scheduler. I also don't see how to generalize prioritization to map,
>> reduce, parallel foreach, etc. w/o making the API much more complex.
>
> I was not talking about thread priority, but ordering priority (which
> task gets chosen first). I don't really care about thread priority in my
> application, and I understand that per-task thread priority doesn't make
> much sense. If I needed per-task thread priority I'd simply make pools
> for the various thread priorities and put tasks in the right pools.
>
> That said, perhaps I could do exactly that: create two or three pools
> with different thread priorities, put tasks into the right pool and let
> the OS sort out the scheduling. But then the question becomes: how do I
> choose the thread priority of a task pool? It doesn't seem possible from
> the documentation. Perhaps TaskPool's constructor should have a
> parameter for that.
>

This sounds like a good solution.  The general trend I've seen is that 
the ability to create >1 pools elegantly solves a lot of problems that 
would be a PITA from both an interface and an implementation perspective 
to solve more directly.  I've added a priority property to TaskPool that 
allows setting the OS priority of the threads in the pool.  This just 
forwards to core.thread.Thread.priority, so usage is identical.
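
To make the multiple-pool idea concrete, here is a minimal sketch, 
assuming the priority property behaves as described above.  The pool 
sizes, the square() helper and the variable names are made up for 
illustration, and Thread.PRIORITY_MIN is just one possible value; how 
much effect it has depends on the OS scheduler.

```d
import core.thread : Thread;
import std.parallelism : TaskPool, task;

// Hypothetical work function, only here so the tasks have something to do.
int square(int x) { return x * x; }

void main()
{
    // Two independent pools; two worker threads each is an arbitrary choice.
    auto urgent = new TaskPool(2);
    auto background = new TaskPool(2);
    scope (exit) { urgent.finish(); background.finish(); }

    // Takes the same values as core.thread.Thread.priority.
    background.priority = Thread.PRIORITY_MIN;

    // Submit each task to the pool whose thread priority suits it.
    auto t1 = task!square(6);
    auto t2 = task!square(7);
    urgent.put(t1);
    background.put(t2);

    // Block until each result is available.
    assert(t1.yieldForce == 36);
    assert(t2.yieldForce == 49);
}
```

The pool that a task is put into decides which threads may run it, so 
the OS scheduler does the prioritizing and TaskPool's API stays simple.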

> - - -
>
> Another remark: in the documentation for the TaskPool constructor, it
> says:
>
> "Default constructor that initializes a TaskPool with one worker thread
> for each CPU reported available by the OS, minus 1 because the thread
> that initialized the pool will also do work."
>
> This "minus 1" thing doesn't really work for me. It certainly makes sense
> for a parallel foreach use case -- whenever the current thread would
> block until the work is done you can use that thread to work too -- but
> in my use case I delegate all the work to other threads because my main
> thread isn't a dedicated working thread and it must not block. It'd be
> nice to have a boolean parameter for the constructor to choose if the
> main thread will work or not (and whether it should do minus 1 or not).
>
> For the global taskPool, I guess I would just have to write
> "defaultPoolThreads = defaultPoolThreads+1" at the start of the program
> if the main thread isn't going to be working.
>
>

I've solved this, though in a slightly different way.  Based on 
discussions on this newsgroup I had recently added an osReportedNcpu 
variable to std.parallelism instead of using core.cpuid.  This is an 
immutable global variable that is set in a static this() module constructor.

Since we don't know what the API for querying stuff like this should be, 
I had made it private.  I changed it to public.  I realized that, even 
if a more full-fledged API is added at some point for this stuff, there 
should be an obvious, convenient way to get it directly from 
std.parallelism anyhow, and it would be trivial to call whatever API 
eventually evolves to set this value.  Now, if you don't like the -1 
thing, you can just do:

auto pool = new TaskPool(osReportedNcpu);

or

defaultPoolThreads = osReportedNcpu;
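
As a self-contained sketch of both variants: this assumes the CPU count 
is reachable under the name totalCPUs, which is what released versions 
of std.parallelism call it, rather than osReportedNcpu as above.  It 
also assumes defaultPoolThreads is assigned before the global taskPool 
is first used, since that pool is created lazily.

```d
import std.parallelism : TaskPool, defaultPoolThreads, taskPool, totalCPUs;

void main()
{
    // Size the global pool to every CPU, without the "minus 1".
    // This has to happen before the first access to taskPool.
    defaultPoolThreads = totalCPUs;
    assert(taskPool.size == totalCPUs);

    // Same idea for an explicitly constructed pool.
    auto pool = new TaskPool(totalCPUs);
    scope (exit) pool.finish();
    assert(pool.size == totalCPUs);
}
```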

