review of std.parallelism

Mon Mar 21 06:50:09 PDT 2011

On 3/21/2011 8:37 AM, Michel Fortin wrote:
> Well, it'll work irrespective of whether shared delegates are used or
> not. I think you could add a compile-time check that the array element
> size is a multiple of the word size when the element is passed by ref in
> the loop and leave the clever trick as a possible future improvement.
> Would that work?

On second thought, no, but for practical, not theoretical reasons:  One, 
you can't introspect whether a foreach loop is using a ref or a value 
parameter.  This is an issue with how opApply works.  Two, AFAIK there's 
no way to get the native word size.
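
For reference, the size half of the proposed check could be sketched roughly as below. This is only an illustration under a loud assumption: it treats size_t.sizeof (the pointer width) as a stand-in for the native word size, which is not guaranteed to hold on every target. The name isWordMultiple is made up here and is not part of std.parallelism. It also does nothing about the first problem, since the loop body's ref-ness still cannot be detected.

```d
// Hypothetical helper, not part of std.parallelism or Phobos.
// ASSUMPTION: size_t.sizeof (pointer width) approximates the native
// word size. This is a proxy only, not a guarantee on every target.
enum bool isWordMultiple(T) = (T.sizeof % size_t.sizeof) == 0;

// Checked entirely at compile time.
static assert(isWordMultiple!(size_t));    // exactly one "word"
static assert(isWordMultiple!(void*[4]));  // four pointers
static assert(!isWordMultiple!(ubyte));    // one byte, never a multiple

unittest
{
    // The same predicate can be used in ordinary code as well.
    assert(isWordMultiple!(size_t[2]));
    assert(!isWordMultiple!(ubyte[3]));
}
```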

>>
>> I'd go a little further. If the guarantees that shared was supposed to
>> provide are strong, i.e. apply no matter what threading module is
>> used, then I utterly despise it. It's one of the worst decisions made
>> in the design of D. Making things pedantically strict, so that the
>> type system gets in the way more than it helps, encourages the user to
>> reflexively circumvent the type system without thinking hard about
>> doing this, thus defeating its purpose. (The alternative of always
>> complying with what the type system "expects" you to do is too
>> inflexible to even be worth considering.) Type systems should err on
>> the side of accepting a superset of what's correct and treating code
>> as innocent until proven guilty, not the other way around. I still
>> believe this even if some of the bugs it could be letting pass through
>> might be very difficult to debug. See the discussion we had a few
>> weeks ago about implicit integer casting and porting code to 64-bit.
>
> I agree with you that this is a serious problem. I think part of why it
> hasn't been talked about much yet is that nobody is currently using D2
> seriously for multithreaded stuff at this time (apart from you I guess),
> so we're missing experience with it. Andrei seems to think it's fine to
> require casts as soon as you need to protect something beyond an
> indirection inside synchronized classes, with the mitigation measure
> that you can make classes share their mutex (not implemented yet I
> think) so if the indirection leads to a class it is less of a problem.
> Personally, I don't.
>
>
>> My excuse for std.parallelism is that it's pedal-to-metal parallelism,
>> so it's more acceptable for it to be dangerous than general case
>> concurrency. IMHO when you use the non-@safe parts of std.parallelism
>> (i.e. most of the library), that's equivalent to casting away shared
>> in a whole bunch of places. Typing "import std.parallelism;" in a
>> non-@safe module is an explicit enough step here.
>
> I still think this "pedal-to-metal" qualification needs to be justified.
> Not having shared delegates in the language seems like an appropriate
> justification to me. Wanting to bypass casts you normally have to do
> around synchronized as the sole reason seems like a bad justification to
> me.
>
> It's not that I like how synchronized works, it's just that I think it
> should work the same everywhere.

This is where you and I disagree.  I think that the type system's 
guarantees should be weak, i.e. only apply to std.concurrency.  IMHO the 
strictness is reasonable when using message passing as your primary 
method of multithreading and only very little shared state.  However, 
it's completely unreasonable if you want to use a paradigm where shared 
state is more heavily used.  D, being a systems language, needs to allow 
other styles of multithreading without making them a huge PITA that 
requires casts everywhere.
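
To make the two styles concrete, here is a minimal sketch. The function names (doubler, doubleViaMessage, fillWithSquares) are invented for illustration. The first half uses std.concurrency, where threads only exchange values. The second half uses a std.parallelism parallel foreach, where every worker thread writes into the same ordinary, unshared array with no casts and no checking by the type system.

```d
import std.concurrency : ownerTid, receiveOnly, send, spawn;
import std.parallelism : parallel;

// Message passing: the worker thread receives a value and sends a
// result back to its owner. No mutable state is shared.
void doubler()
{
    immutable n = receiveOnly!int();
    ownerTid.send(n * 2);
}

int doubleViaMessage(int n)
{
    auto tid = spawn(&doubler);
    tid.send(n);
    return receiveOnly!int();
}

// Shared state: worker threads write to elements of one plain int[].
// Each iteration touches only its own element, but nothing in the
// type system enforces that.
void fillWithSquares(int[] data)
{
    foreach (i, ref x; parallel(data))
        x = cast(int)(i * i);
}

unittest
{
    assert(doubleViaMessage(21) == 42);

    auto data = new int[10];
    fillWithSquares(data);
    assert(data[3] == 9);
}
```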

>> The guarantee is still preserved that, if you only use std.concurrency
>> (D's flagship "safe" concurrency module) for multithreading and don't
>> cast away shared, there can be no low level data races. IMHO this is
>> still a substantial accomplishment in that there exists a way to do
>> safe, statically checkable concurrency in D, even if it's not the
>> **only** way concurrency can be done. BTW, core.thread can also be
>> used to get around D's type system, not just std.parallelism. If you
>> want to check that only safe concurrency is used, importing
>> std.parallelism and core.thread can be grepped just as easily as
>> casting away shared.
>
> Unless I'm mistaken, the only thing that bypasses race-safety in
> core.thread is the Thread constructor that takes a delegate. Which means
> it could easily be made race-safe by making that delegate parameter
> shared (once shared delegates are implemented).

And then you'd only need one cast to break it if you wanted to, not 
casts everywhere.  Just cast an unshared delegate to shared when passing 
it to core.thread.

>
>> If, on the other hand, the guarantees of shared are supposed to be
>> weak in that they only apply to programs where only std.concurrency is
>> used for multithreading, then I think strictness is the right thing to
>> do. The whole point of std.concurrency is to give strong guarantees,
>> but if you prefer more dangerous but more flexible multithreading,
>> other paradigms should be readily available.
>
> I think the language as a whole is designed to have strong guarantees,
> otherwise synchronized classes wouldn't require out-of-guarantee casts at
> every indirection.

Well, there's an easy way around that, too.  Just declare the whole 
method body synchronized, but don't declare the method synchronized in 
the signature.
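
A minimal sketch of what I mean, using a made-up Account class: the methods are not marked synchronized in their signatures, but each body is wrapped in a synchronized statement on the object itself, so it still runs while holding the object's monitor.

```d
class Account
{
    private int[] history;  // indirection guarded by this object's monitor

    // Not declared `synchronized` in the signature...
    void record(int amount)
    {
        // ...but the whole body runs while holding this object's lock.
        synchronized (this)
        {
            history ~= amount;
        }
    }

    size_t count()
    {
        synchronized (this)
        {
            return history.length;
        }
    }
}

unittest
{
    auto a = new Account;
    a.record(5);
    a.record(7);
    assert(a.count() == 2);
}
```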

>
> I'm not too pleased with the way synchronized classes are supposed to
> work, nor am I too pleased with how it impacts the rest of the language.
> But if this is a problem (and I think it is), it ought to be fixed
> globally, not by shutting down safeties in every module dealing with
> multithreading that isn't std.concurrency.
>

Again, I completely disagree.  IMHO it's not fixable globally such that 
both of the following are achieved:

1.  The strong guarantees when using only std.concurrency are preserved.

2.  More shared state-intensive multithreading can be done without the 
type system getting in the way more than it helps.

>
>> I'm **still** totally confused about how shared is supposed to work,
>> because I don't have a fully debugged/implemented implementation or
>> good examples of stuff written in this paradigm to play around with.
>
> I think nobody has played much with the paradigm at this point, or we'd
> have heard some feedback. Well, actually we have your feedback which
> seems to indicate that it's better to shut off safeties than to play nice
> with them.
>

I agree that it's better to shut off the safeties **unless** you're 
doing very coarse-grained multithreading with very little shared state 
like std.concurrency had in mind.  If this is what you want, then the 
safeties are great.

Unfortunately, I'm going to have to take a hard line on this one.  The 
issue of integrating std.parallelism into the race safety system has
been discussed several times in the past, and it was basically agreed that
std.parallelism is a "here be dragons" module that cannot reasonably be 
made to conform to such a model.  Given the choice (and I hope I'm not 
forced to make this choice) I'd rather std.parallelism be a third-party 
module that's actually usable than a Phobos module that is such a PITA 
to use that no one does in practice.