std.parallelism: Final review
Michel Fortin
michel.fortin at michelf.com
Sat Mar 19 06:37:17 PDT 2011
On 2011-03-18 22:27:14 -0400, dsimcha <dsimcha at yahoo.com> said:
> I think your use case is both beyond the scope of std.parallelism and
> better handled by std.concurrency. std.parallelism is mostly meant to
> handle the pure multicore parallelism use case. It's not that it
> **can't** handle other use cases, but that's not what it's tuned for.
I know. But if this makes its way into the standard library, perhaps it
should aim at reaching a slightly wider audience? Especially since it
lacks so little to become more general purpose...
> As far as prioritization, it wouldn't be hard to implement
> prioritization of when a task starts (i.e. have a high- and
> low-priority queue). However, the whole point of TaskPool is to avoid
> starting a new thread for each task. Threads are recycled for
> efficiency. This prevents changing the priority of things in the OS
> scheduler. I also don't see how to generalize prioritization to map,
> reduce, parallel foreach, etc. w/o making the API much more complex.
I was not talking about thread priority, but ordering priority (which
task gets chosen first). I don't really care about thread priority in
my application, and I understand that per-task thread priority doesn't
make much sense. If I needed per-task thread priority I'd simply make
pools for the various thread priorities and put tasks in the right
pools.
That said, perhaps I could do exactly that: create two or three pools
with different thread priorities, put tasks into the right pool and let
the OS sort out the scheduling. But then the question becomes: how do I
choose the thread priority of a task pool? It doesn't seem possible from
the documentation. Perhaps TaskPool's constructor should have a
parameter for that.
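A rough sketch of the multi-pool idea, under a loud assumption: it
relies on TaskPool having a settable `priority` property, which current
Phobos provides but which the documentation under review does not show.
The `work` function and the pool sizes are made up for illustration,
and the actual effect of a thread priority depends on the OS scheduler.

```d
import core.thread : Thread;
import std.parallelism : TaskPool, task;
import std.stdio : writeln;

// Hypothetical stand-in for a real unit of work.
int work(int x) { return x * 2; }

void main()
{
    // One pool per priority class; the sizes are arbitrary.
    auto background = new TaskPool(2);
    auto interactive = new TaskPool(2);

    // Assumption: TaskPool.priority applies to every worker thread
    // of that pool (present in current Phobos).
    background.priority = Thread.PRIORITY_MIN;
    interactive.priority = Thread.PRIORITY_DEFAULT;

    auto slow = task!work(1);
    background.put(slow);      // a "do it eventually" task

    auto urgent = task!work(2);
    interactive.put(urgent);   // a user-requested task

    writeln(urgent.yieldForce);
    writeln(slow.yieldForce);

    // Let the worker threads terminate once their queues are empty.
    background.finish();
    interactive.finish();
}
```

User-requested tasks go to `interactive` and never queue up behind
background work; which threads actually get CPU time is left to the OS.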
> In addition, std.parallelism guarantees that tasks will be started in
> the order that they're submitted, except that if the results are needed
> immediately and the task hasn't been started yet, it will be pulled out
> of the middle of the queue and executed immediately. One way to get
> the prioritization you need is to just submit the tasks in order of
> priority, assuming you're submitting them all from the same place.
Most of my tasks are background tasks that just need to be done
eventually while others are user-requested tasks which can be requested
at any time in the main thread. Issuing them serially is not really an
option.
> One last thing: As far as I/O goes, AsyncBuf may be useful. This
> allows you to pipeline reading of a file and higher level processing.
> Example:
>
> // Read the lines of a file into memory in parallel with processing
> // them.
> import std.stdio, std.parallelism, std.algorithm, std.array, std.conv;
>
> void main() {
>     auto lines = map!"a.idup"(File("foo.txt").byLine());
>     auto pipelined = taskPool.asyncBuf(lines);
>
>     foreach(line; pipelined) {
>         auto ls = line.split("\t");
>         auto nums = to!(double[])(ls);
>     }
> }
Looks nice, but doesn't really work for what I'm doing. Currently I
have one task per file, each task reading a relatively small file and
then parsing its content.
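For illustration, the one-task-per-file pattern might look roughly like
the sketch below. The file format (whitespace-separated numbers) and
the names `parseNumbers` and `parseFile` are assumptions made for the
example, not the actual application code.

```d
import std.array : split;
import std.conv : to;
import std.file : readText;
import std.parallelism : task, taskPool;
import std.stdio : writeln;

// Hypothetical parser: whitespace-separated floating-point numbers.
double[] parseNumbers(string text)
{
    return text.split().to!(double[]);
}

// One unit of work: read a small file, then parse its content.
double[] parseFile(string path)
{
    return parseNumbers(readText(path));
}

void main(string[] args)
{
    // task(&fn, args) allocates the Task on the heap and returns a
    // pointer, so the tasks can be collected in an array.
    alias FileTask = typeof(task(&parseFile, ""));
    FileTask[] tasks;

    foreach (path; args[1 .. $])
    {
        auto t = task(&parseFile, path);
        taskPool.put(t);   // queued; a worker thread picks it up
        tasks ~= t;
    }

    // yieldForce blocks until a result is ready. A thread that must
    // not block could poll the task's `done` property instead.
    foreach (i, t; tasks)
        writeln(args[i + 1], ": ", t.yieldForce.length, " values");
}
```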
- - -
Another remark: in the documentation for the TaskPool constructor, it says:
"Default constructor that initializes a TaskPool with one worker
thread for each CPU reported available by the OS, minus 1 because the
thread that initialized the pool will also do work."
This "minus 1" thing doesn't really work for me. It certainly makes
sense for a parallel foreach use case -- whenever the current thread
would block until the work is done you can use that thread to work too
-- but in my use case I delegate all the work to other threads because
my main thread isn't a dedicated working thread and it must not block.
It'd be nice to have a boolean parameter for the constructor to choose
if the main thread will work or not (and whether it should do minus 1
or not).
For the global taskPool, I guess I would just have to write
"defaultPoolThreads = defaultPoolThreads+1" at the start of the program
if the main thread isn't going to be working.
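A minimal sketch of that workaround, assuming the default value of
defaultPoolThreads is totalCPUs - 1 as the quoted documentation
implies. The helper name `includeMainThreadCore` is made up for the
example.

```d
import std.parallelism : TaskPool, defaultPoolThreads, taskPool,
    totalCPUs;
import std.stdio : writeln;

// Hypothetical helper. Given the documented default of totalCPUs - 1,
// this has the same effect as
// "defaultPoolThreads = defaultPoolThreads + 1".
void includeMainThreadCore()
{
    defaultPoolThreads = totalCPUs;
}

void main()
{
    // Must happen before the first use of taskPool: the global pool
    // is created lazily and keeps the size it was created with.
    includeMainThreadCore();
    writeln("global pool workers: ", taskPool.size);

    // A private pool can simply be given an explicit size.
    auto pool = new TaskPool(totalCPUs);
    scope (exit) pool.finish();
    writeln("private pool workers: ", pool.size);
}
```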
--
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/