review of std.parallelism
dsimcha
dsimcha at yahoo.com
Fri Mar 18 21:40:08 PDT 2011
Thanks for the advice. You mentioned in the past that the documentation
was inadequate but didn't give enough specifics as to how until now. As
the author of the library, things seem obvious to me that don't seem
obvious to anyone else, so I don't feel that I'm in a good position to
judge the quality of the documentation and where it needs improvement.
I plan to fix most of the issues you raised, but I've left comments for
the few that I can't/won't fix or believe are based on misunderstandings
below.
On 3/18/2011 11:29 PM, Andrei Alexandrescu wrote:
> 1. Library proper:
>
> * "In the case of non-random access ranges, parallel foreach is still
> usable but buffers lazily to an array..." Wouldn't strided processing
> help? If e.g. 4 threads the first works on 0, 4, 8, ... second works on
> 1, 5, 9, ... and so on.
You can have this if you want, by setting the work unit size to 1.
Setting it to a larger size just causes more elements to be buffered,
which may be more efficient in some cases.
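
To make the work unit size point concrete, here is a minimal sketch. It
assumes the taskPool.parallel(range, workUnitSize) overload as it exists
in the released Phobos std.parallelism; the version under review may
differ in details. The helper name squaresOfEvens and the data are made
up for illustration.

```d
import std.algorithm : filter;
import std.parallelism : taskPool;
import std.range : iota;

// Hypothetical helper: squares the even numbers below `limit` in parallel.
long[] squaresOfEvens(int limit)
{
    // filter yields a forward range without random access, so parallel
    // foreach has to buffer elements before handing them to workers.
    auto evens = iota(limit).filter!(x => x % 2 == 0);
    auto result = new long[]((limit + 1) / 2);

    // Work unit size 1: workers are handed one element at a time.
    foreach (i, x; taskPool.parallel(evens, 1))
    {
        result[i] = cast(long) x * x; // each index is written exactly once
    }
    return result;
}

void main()
{
    assert(squaresOfEvens(8) == [0L, 4, 16, 36]);
}
```

Passing a larger second argument keeps the loop body unchanged and only
changes how many elements are buffered per work unit.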
>
> * I'm unclear on the tactics used by lazyMap. I'm thinking the obvious
> method should be better: just use one circular buffer. The presence of
> two dependent parameters makes this abstraction difficult to operate with.
>
> * Same question about asyncBuf. What is wrong with a circular buffer
> filled on one side by threads and consumed from the other side by the
> client? I can think of a couple of answers but it would be great if they
> were part of the documentation.
Are you really suggesting I give detailed rationales for implementation
decisions in the documentation? Anyhow, there are two reasons for this
choice. First, it avoids needing synchronization/atomic ops/etc. on
every write to the buffer, which a circular buffer would require since
it can be read and written concurrently and we would need to track
whether there is space to write to. Second, parallel map works best
when it operates on relatively large buffers, resulting in minimal
synchronization overhead per element. (Under the hood, the buffer is
filled and then eager parallel map is called.)
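
The "fill a buffer, then call eager parallel map" idea can be sketched
roughly as follows. This is an illustration of the idea only, not the
library's actual implementation: bufferedMap and twice are hypothetical
names, the eager map is spelled taskPool.amap as in the released Phobos
(an assumption relative to the reviewed version), and the sketch makes no
attempt to overlap filling with consumption.

```d
import std.algorithm : filter;
import std.parallelism : taskPool;
import std.range : iota;
import std.range.primitives : empty, front, popFront;

int twice(int x) { return 2 * x; }

// Hypothetical helper: maps `twice` over `source` one buffer at a time.
int[] bufferedMap(R)(R source, size_t bufSize)
{
    assert(bufSize > 0);
    int[] output;
    auto buf = new int[](bufSize);
    while (!source.empty)
    {
        // Fill the buffer from a single thread: no locks or atomic
        // operations are needed per element written.
        size_t n = 0;
        for (; n < bufSize && !source.empty; source.popFront())
            buf[n++] = source.front;

        // One eager parallel map per filled buffer, so synchronization
        // cost is paid per buffer rather than per element.
        output ~= taskPool.amap!twice(buf[0 .. n]);
    }
    return output;
}

void main()
{
    auto odds = iota(10).filter!(x => x % 2 == 1);
    assert(bufferedMap(odds, 3) == [2, 6, 10, 14, 18]);
}
```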
> * Why not make workerIndex a ulong and be done with it?
I doubt anyone's really going to create anywhere near 4 billion TaskPool
threads over the lifetime of a program. Part of the point of TaskPool
is recycling threads rather than paying the overhead of creating and
destroying them. Using a ulong on a 32-bit architecture would make
worker-local storage substantially slower. workerIndex is how
worker-local storage works under the hood, so it needs to be fast.
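
For context, this is roughly what worker-local storage looks like from
client code, which never touches workerIndex directly. The sketch assumes
the workerLocalStorage, get and toRange names from the released Phobos
std.parallelism; parallelSum is a hypothetical helper.

```d
import std.parallelism : taskPool;
import std.range : iota;

// Hypothetical helper: sums 1 .. n using one accumulator per thread.
long parallelSum(int n)
{
    // One slot for each thread that can execute loop iterations.
    auto partial = taskPool.workerLocalStorage(0L);

    foreach (x; taskPool.parallel(iota(1, n + 1)))
    {
        // get returns the calling thread's own slot, so no locking
        // is needed on this hot path.
        partial.get += x;
    }

    // Combine the per-thread results after the parallel loop is done.
    long total = 0;
    foreach (s; partial.toRange)
        total += s;
    return total;
}

void main()
{
    assert(parallelSum(100) == 5050);
}
```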
> * No example for workerIndex and why it's useful.
It should just be private. The fact that it's public is an artifact of
when I was designing worker-local storage and didn't know how it was
going to work yet. I never thought to revisit this until now. It
really isn't useful to client code.
> * Is stop() really trusted or just unsafe? If it's forcibly killing
> threads then it's unsafe.
It's not forcibly killing threads. As the documentation states, it has
no effect on jobs already executing, only ones in the queue.
Furthermore, it's needed unless makeDaemon is called. Speaking of
which, makeDaemon and makeAngel should probably be trusted, too.
> * defaultPoolThreads - should it be a @property?
Yes. In spirit it's a global variable. It requires some extra
machinations, though, to be threadsafe, which is why it's not
implemented as a simple global variable.
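
With property syntax, usage reads like a plain global variable. A minimal
sketch, assuming the getter/setter pair and TaskPool.size as they exist in
the released Phobos std.parallelism, and assuming the global taskPool has
not been used yet when the value is set:

```d
import std.parallelism : defaultPoolThreads, taskPool;

void main()
{
    // Must happen before the first use of taskPool, which is created
    // lazily with however many threads are configured at that point.
    defaultPoolThreads = 2;

    assert(defaultPoolThreads == 2);
    assert(taskPool.size == 2); // number of worker threads in the pool
}
```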
> * No example for task().
???? Yes there is, for both flavors, though these could admittedly be
improved. Only the safe version doesn't have an example, and this is
just a more restricted version of the function pointer case, so it seems
silly to make a separate example for it.
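
For reference, the two flavors look roughly like this. A minimal sketch,
assuming the task, put and yieldForce names from the released Phobos
std.parallelism; add is a made-up function for illustration.

```d
import std.parallelism : task, taskPool;

int add(int a, int b) { return a + b; }

void main()
{
    // Flavor 1: the function is passed as a compile-time alias.
    auto t1 = task!add(1, 2);
    taskPool.put(t1);

    // Flavor 2: the function is passed at run time as a function pointer.
    auto t2 = task(&add, 3, 4);
    taskPool.put(t2);

    // yieldForce waits for the task to finish and returns its result.
    assert(t1.yieldForce == 3);
    assert(t2.yieldForce == 7);
}
```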
> * What is 'run' in the definition of safe task()?
It's just the run() adapter function. Isn't that obvious?