review of std.parallelism

dsimcha dsimcha at yahoo.com
Mon Mar 21 08:54:29 PDT 2011


== Quote from Michel Fortin (michel.fortin at michelf.com)'s article
> > On second thought, no, but for practical, not theoretical reasons:
> > One, you can't introspect whether a foreach loop is using a ref or a
> > value parameter.  This is an issue with how opApply works.
> Indeed a problem. Either we fix the compiler to support that, or we
> change the syntax to something like this:
> 	taskPool.apply(range, (ref int value) {
> 		...
> 	});
> Or we leave things as they are.
> > Two, AFAIK there's no way to get the native word size.
> Right. That's a problem too... you could probably alleviate this by
> doing a runtime check with some fancy instructions to get the native
> word size, but I'd expect that to be rather convoluted.
> I'd like to check if I understand that well. For instance this code:
> 	int[100] values;
> 	foreach (i, ref value; parallel(values))
> 		value = i;
> would normally run fine on a 32-bit processor, but it'd create a
> low-level race on a 64-bit processor (even a 64-bit processor running
> a 32-bit program in 32-bit compatibility mode). And even that is a
> generalization, some 32-bit processors out there *might* have 64-bit
> native words. So the code above isn't portable. Is that right?
> Which makes me think... we need to document those pitfalls somewhere.
> Perhaps std.parallelism's documentation should link to a related page
> about what you can and what you can't do safely. People who read that
> "all the safeties are off" in std.parallelism aren't going to
> understand what you're talking about unless you explain the pitfalls
> with actual examples (like the one above).

This problem is **much** less severe than you are suggesting.  x86 can address
single bytes, so it's not a problem even if you're iterating over bytes on a
64-bit machine.  CPUs like Alpha (for which no D compiler even exists) can't
natively address individual bytes.  Therefore, writing to a byte would be
implemented much like writing to a bit is on x86:  you'd read the full word in,
change one byte, and write the full word back.  I'm not sure exactly how this
would be implemented at the compiler level, or whether you'd even be allowed to
have a reference to a byte in such an implementation.  This is why I consider
this more of a theoretical problem than a serious practical issue.
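
To make the pitfall concrete, here's a rough sketch (my own illustration, not
code from the thread; the function name and exact lowering are hypothetical) of
what a compiler would have to emit for a single byte store on a CPU that can
only address full words.  Because the whole word is rewritten non-atomically,
two threads writing adjacent bytes of the same word can clobber each other:

	// Hypothetical lowering of "store one byte" on a word-addressable CPU.
	// The read-modify-write of the containing word is not atomic, so
	// concurrent writes to *neighboring* bytes can race.
	void storeByteViaWord(size_t* wordPtr, size_t byteIndex, ubyte b)
	{
	    size_t word = *wordPtr;                    // read the full word
	    immutable shift = byteIndex * 8;
	    word &= ~(cast(size_t) 0xFF << shift);     // clear the target byte
	    word |= (cast(size_t) b) << shift;         // splice in the new byte
	    *wordPtr = word;                           // write the full word back
	}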


> > Unfortunately, I'm going to have to take a hard line on this one.  The
> > issue of integrating std.parallelism into the race safety system had
> > been discussed a bunch in the past and it was basically agreed that
> > std.parallelism is a "here be dragons" module that cannot reasonably be
> > made to conform to such a model.
> About the "cannot reasonably be made to conform to such a model" part:
> that is certainly true today, but might turn out not to be true as the
> model evolves. It certainly is true as long as we don't have shared
> delegates. Beyond that it becomes more fuzzy. The final goal of the
> module shouldn't be to bypass safeties but to provide good parallelism
> primitives. By all means, if safeties can be enabled reasonably they
> should be (but as of today, they can't).

I'll agree that, if the model evolves sufficiently, everything I've said deserves
to be reconsidered.  My comments only apply to the model as it exists currently or
will exist in the foreseeable future.  It's just that I have **serious** doubts
that it will evolve much, and I don't want to delay things for an academic
discussion of what-ifs.

> And I totally agree with you that it's quite silly to require casts
> everywhere to use synchronized. I'll be the first to admit that it's
> hard to see synchronized classes being very practical as they're
> implemented today. There's room for improvements there too.

This is the point I'm trying to make:  Requiring these casts is perfectly
reasonable if you assume that shared state is going to be rare, as it is in the
std.concurrency model.  When the user wants a more fine-grained, shared
state-heavy paradigm, I don't think it's possible to make it safe without also
making it virtually unusable.  This is why we punted and decided to go with
message passing and very limited shared data as the flagship multithreading
model.  On the other hand, in a systems language, we can't effectively prohibit
shared state-heavy multithreading by making it virtually unusable.  Thus, the
no-low-level-races guarantee needs to apply only to std.concurrency.
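
For contrast, here's a rough sketch of the message-passing style I'm referring
to (my own illustration, not code from the thread): mutable data has to be
explicitly cast to immutable (or shared) at the thread boundary, which is
tolerable precisely because such hand-offs are rare in that model:

	import std.concurrency;

	void worker()
	{
	    // Receive the array handed off by the main thread.
	    auto data = receiveOnly!(immutable(int)[])();
	    // ... read-only work on data ...
	}

	void main()
	{
	    auto tid = spawn(&worker);
	    int[] numbers = [1, 2, 3];
	    // The cast is the explicit "nobody else will mutate this" step
	    // required to pass the data across threads.
	    tid.send(cast(immutable(int)[]) numbers);
	}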

