review of std.parallelism

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Fri Mar 18 20:29:50 PDT 2011


0. Overview and vote

I think the library delivers the high-level goods (parallel foreach, 
map, reduce) but is a bit fuzzy in the lower level details.

Documentation hurts understanding of the capabilities of the library and 
essentially is of inadequate quality. Entity documentation and examples 
do little more than give the impression of dispensing with an unpleasant 
chore.

My vote in favor of acceptance is contingent upon a radical improvement 
in the quality of documentation and examples. Most if not all artifacts 
should be motivated by simple, compelling examples. The introductory 
section must contain a brief and attractive synopsis of the flagship 
capabilities. All terms must be defined before being used and introduced 
in a carefully-chosen order. The relationship between various entities 
should be clarified.

I've seen the argument that simple and strong examples are difficult to 
find. Though I agree such examples are not easy to come up with, I also 
believe the author of library is in the best position to produce them.

================

1. Library proper:

* "In the case of non-random access ranges, parallel foreach is still 
usable but buffers lazily to an array..." Wouldn't strided processing 
help? If e.g. 4 threads the first works on 0, 4, 8, ... second works on 
1, 5, 9, ... and so on.

* Example with squares would be better if double replaced uint, and if a 
more complicated operation (sqrt, log...) were involved.

* I'm unclear on the tactics used by lazyMap. I'm thinking the obvious 
method should be better: just use one circular buffer. The presence of 
two dependent parameters makes this abstraction difficult to operate with.

* Same question about asyncBuf. What is wrong with a circular buffer 
filled on one side by threads and on the consumed from the other by the 
client? I can think of a couple of answers but it would be great if they 
were part of the documentation.

* Why not make workerIndex a ulong and be done with it?

* Is stop() really trusted or just unsafe? If it's forcibly killing 
threads then its unsafe.

* uint size() should be size_t for conformity.

* defaultPoolThreads - should it be a @property?

* I don't understand what Task is supposed to do. It is circularly 
defined: "A struct that encapsulates the information about a task, 
including its current status, what pool it was submitted to, and its 
arguments." OK, but what's a task? Could a task be used outside a task 
pool, and if so to what effect(s)?

* If LazyMap is only necessary as the result of lazyMap, could that 
become a local type inside lazyMap?


2. Documentation:

* Documentation unacceptable. It lacks precision and uses terms before 
defining them. For example: "This class encapsulates a task queue and a 
set of worker threads" comes before the notions of "task queue" and 
"worker thread" are defined. Clearly there is an intuition of what those 
are supposed to mean, but in practice each library lays down some fairly 
detailed definition of such.

* "Note: Initializing a pool with zero threads (as would happen in the 
case of a single-core CPU) is well-tested and does work." The absence of 
a bug should not be advertised in so many words. Simpler: "Note: 
Single-CPU machines will operate transparently with zero-sized pools."

* "Allows for custom pool size." I have no idea what pool size means.

* "// Do something interesting." Worst example ever. You are the creator 
of the library. If _you_ can't find a brief compelling example, who could?

* "Immediately after the range argument, an optional block size argument 
may be provided. If none is provided, the default block size is used. An 
optional buffer for returining the results may be provided as the last 
argument. If one is not provided, one will be automatically allocated. 
If one is provided, it must be the same length as the range." An example 
should be inserted after each of these sentences.

* "// Do something expensive with line." Better you do something 
expensive with line.

* "This is not the same thing as commutativity." I think this is obvious 
enough to be left out. The example of matrices (associative but not 
commutative) is nice though.

* "immutable n = 1000000000;" -> use underscores

* Would be great if the documentation examples included some rough 
experimental results, e.g. "3.8 times faster on a 4-core machine than 
the equivalent serial loop".

* No example for workerIndex and why it's useful.

* Is LocalWorker better than WorkerLocal? No, because it's not the 
worker that's local, it's the storage - which is missing from the name! 
WorkerLocalStorage? Also replace "create" with "make" or drop it 
entirely. The example doesn't tell me how I can use bufs. I suspect 
workerIndex has something to do with it but the example fails to divulge 
that relationship.

* No example for join(), which is quite a central primitive.

* No example for put(), which would be interesting to see.

* No example for task().

* No example for run().

* The example for Task uses myFuture without defining it. In case Task 
defines the promise/future idiom, the documentation misses a terrific 
opportunity of clarifying that.

* What is 'run' in the definition of safe task()?

* Why is AsyncBuf's doc far away from the function returning it? Same 
about WorkerLocal.


Andrei


More information about the Digitalmars-d mailing list