std.concurrency and efficient returns

Sun Aug 1 12:25:38 PDT 2010

On 01/08/2010 19:17, dsimcha wrote:
> == Quote from Jonathan M Davis (jmdavisprog at gmail.com)'s article
>> Okay. From what I can tell, it seems to be a recurring pattern with threads that
>> it's useful to spawn a thread, have it do some work, and then have it return the
>> result and terminate. The appropriate way to do that seems to be to spawn the
>> thread with the data that needs to be passed and then use send to send what would
>> normally be the return value before the function (and therefore the spawned
>> thread) terminates. I see 2 problems with this, both stemming from immutability.
>
> I think the bottom line is that D's threading model is designed to put safety and
> simplicity over performance and flexibility.  Given the number of bugs that are
> apparently generated when using threading for concurrency in large-scale software
> written by hordes of programmers, this may be a reasonable tradeoff.
>
> Within the message-passing model, one thing that would help a lot is a Unique type
> that can be implicitly and destructively converted to immutable or shared.  In D
> as it stands right now, immutable is basically useless in all but the simplest
> cases because it's just too hard to build complex immutable data structures,
> especially if you want to avoid unnecessary copying or having to rely on casts and
> manually checked assumptions in at least small areas of the program.  In theory,
> immutable solves tons of problems, but in practice it solves very few.  While I
> don't understand shared that well, I guess a Unique type would help in creating
> shared data, too.
>
> There are two reasons for using multithreading:  Parallelism (using multiple cores
> to increase throughput) and concurrency (making things appear to be happening
> simultaneously to decrease latency; this makes sense even on a single-core
> machine).  One may come as a side effect of the other, but usually only one is the
> goal.  It sounds like you're looking for parallelism.  When using threading for
> parallelism as opposed to concurrency, this tradeoff of simplicity and safety in
> exchange for flexibility and performance doesn't work so well because:
>
> 1.  When using threading for parallelism instead of concurrency, it's reasonable
> to do some unsafe stuff to get better performance, since performance is the whole
> point anyhow.
>
> 2.  Unlike the concurrency case, the parallelism case usually occurs only in small
> hotspots of a program, or in small scientific computing programs.  In these cases
> it's not that hard for the programmer to manually track what's shared, etc.
>
> 3.  In my experience at least, parallelism often requires finer-grained
> communication between threads than concurrency.  For example, an OS timeslice is
> about 15 milliseconds, meaning that on single-core machines threads being used for
> concurrency simply can't communicate more often than that.  I've written useful
> parallel code that scaled to at least 4 cores and required communication between
> threads several times per millisecond.  It could have been written more
> efficiently w.r.t. communication between threads, but it would have required a lot
> more memory allocations and been less efficient in other respects.
>
> While I completely agree that message passing should be D's **flagship** threading
> model because it's been proven to work well in a lot of cases, I'm not sure if it
> should be the **only** one well-supported out of the box because it's just too
> inflexible when you want pull-out-all-stops parallelism.  As Robert Jacques
> mentioned, I've been working on a parallelism library.  The code is at:
>
> http://dsource.org/projects/scrapple/browser/trunk/parallelFuture/parallelFuture.d
>
> The docs are at:
>
> http://cis.jhu.edu/~dsimcha/parallelFuture.html
>
> I've been thinking lately about how to integrate this into the new threading
> model, as it's currently completely unsafe, doesn't use shared at all, and was
> written before the new threading model was implemented.  (core.thread still takes
> an unshared delegate).  I think before we can solve the problems you've brought
> up, we need to clarify how non-message passing based multithreading (i.e. using
> shared) is going to work in D, as right now it is completely unclear, at least to me.

I completely agree with everything you said, and I really dislike how D2
currently seems to virtually impose an application architecture based on
the message passing model unless you are willing to circumvent, and thus
break, the entire type system. While I do agree that message passing makes
a lot of sense as the default choice, there also has to be well
thought-out and extensive support for the shared memory model if D2 is
really focusing on the concurrency issue as much as it claims.

Personally, I've found hybrid architectures, where both models are
combined as needed, to be the most flexible and best-performing approach,
and there is no way a language touted as a systems language should
impose one model over the other and stop the programmer from doing
things the way he wants.

/Max