std.concurrency and efficient returns

Robert Jacques sandford at jhu.edu
Sun Aug 1 08:55:54 PDT 2010


On Sun, 01 Aug 2010 06:24:18 -0400, Jonathan M Davis  
<jmdavisprog at gmail.com> wrote:
> Okay. From what I can tell, it seems to be a recurring pattern with
> threads that it's useful to spawn a thread, have it do some work, and
> then have it return the result and terminate. The appropriate way to do
> that seems to be to spawn the thread with the data that needs to be
> passed and then use send to send what would normally be the return
> value before the function (and therefore the spawned thread)
> terminates. I see 2 problems with this, both stemming from immutability.
>
> 1. _All_ of the arguments passed to spawn must be immutable. It's not
> that hard to be in a situation where you need to pass it arguments that
> the parent thread will never use, and it's highly probable that that
> data will have to be copied to make it immutable so that it can be
> passed. The result is that you're forced to make pointless copies. If
> you're passing a lot of data, that could be expensive.
>
> 2. _All_ of the arguments returned via send must be immutable. In the
> scenario that I'm describing here, the thread is going away after
> sending the message, so there's no way that it's going to do anything
> with the data, and having to copy it to make it immutable (as will
> likely have to be done) can be highly inefficient.
>
> Is there a better way to do this? Or if not, can one be created? It
> seems to me that it would be highly desirable to be able to pass
> mutable reference types between threads where the thread doing the
> receiving takes control of the object/array being passed. Due to D's
> threading model, a copy may still have to be done behind the scenes,
> but if you could pass mutable data across while passing ownership, you
> could have at most 1 copy rather than the 2 - 3 copies that would have
> to be taking place when you have a mutable object that you're trying to
> send across threads (so, one copy to make it immutable, possibly a copy
> from one thread local storage to another of the immutable data (though
> I'd hope that that wouldn't require a copy), and one copy on the other
> end to get mutable data from the immutable data). As it stands, it
> seems painfully inefficient to me when you're passing anything other
> than small amounts of data across.
>
> Also, this recurring pattern that I'm seeing makes me wonder if it
> would be advantageous to have an addition to std.concurrency where you
> spawned a thread which returned a value when it was done (rather than
> having to use a send with a void function), and the parent thread used
> a receive call of some kind to get the return value. Ideally, you could
> spawn a series of threads which were paired with the variables that
> their return values would be assigned to, and you could do it all as
> one function call.
>
> Overall, I really like D's threading model, but it seems to me that it
> could be streamlined a bit.
>
> - Jonathan M Davis

Hi Jonathan,
It sounds like what you really want is a task-based parallel programming
library, as opposed to concurrent threads. I'd recommend Dave Simcha's
parallelFuture library if you want to play around with this in D
(http://www.dsource.org/projects/scrapple/browser/trunk/parallelFuture/parallelFuture.d).
However, parallelFuture is currently unsafe: you need to make sure that,
logically speaking, the data the task is passed is immutable.
Shared/const/immutable delegates have been brought up before as a way to
formalize the implicit assumptions of libraries like parallelFuture, but
nothing has come of it yet.
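
As a rough sketch of the task-based style (a function that simply returns
its result, with the caller collecting it later), here is what it might
look like. This uses std.parallelism from later Phobos releases as a
stand-in, which is an assumption on my part; parallelFuture's own API is
not shown here and may differ. The names sumAll and runSum are
hypothetical.

```d
import std.parallelism : task, taskPool;

// Hypothetical worker: only reads its input, so passing an
// immutable slice makes the "logically immutable" rule explicit.
long sumAll(immutable(int)[] data)
{
    long total = 0;
    foreach (x; data)
        total += x;
    return total;
}

// Hypothetical helper: runs sumAll as a task and returns its result.
long runSum(immutable(int)[] data)
{
    auto t = task!sumAll(data); // create the task (not yet running)
    taskPool.put(t);            // queue it on the default thread pool
    return t.yieldForce;        // block until done, then fetch the result
}

void main()
{
    immutable(int)[] data = [1, 2, 3, 4];
    assert(runSum(data) == 10);
}
```

Note that nothing copies the slice here; the immutable element type is
what makes sharing it with the pool thread safe.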
As for std.concurrency, immutability is definitely the correct way to go,
even if it means extra copying: for most jobs the processing should
greatly outweigh the cost of copying and thread initialization (though
under the hood thread pools should help with the latter). A large amount
of experience dictates that shared mutable data, let alone unprotected
mutable data, is a bug waiting to happen.
On a more practical note, relaxing either 1) or 2) can cause major
problems with certain modern GCs, so at a minimum casts should be involved.
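
A minimal sketch of the spawn/send pattern with a cast on the return
path, assuming std.exception.assumeUnique is an acceptable way to express
that cast: the worker builds a mutable array and then hands it over as
immutable without copying. The names worker and squaresViaThread are
hypothetical, and this is only sound if no other mutable reference to the
array survives the cast.

```d
import std.concurrency : spawn, send, receiveOnly, Tid, thisTid;
import std.exception : assumeUnique;

// Hypothetical worker: computes squares, then sends them to the parent.
void worker(Tid parent, immutable(int)[] input)
{
    auto squares = new int[](input.length);
    foreach (i, x; input)
        squares[i] = x * x;

    // Cast to immutable without copying. assumeUnique also nulls out
    // `squares`, so this thread keeps no mutable alias to the data.
    immutable(int)[] result = assumeUnique(squares);
    send(parent, result);
}

// Hypothetical helper: spawns the worker and waits for its result.
immutable(int)[] squaresViaThread(immutable(int)[] input)
{
    spawn(&worker, thisTid, input);
    return receiveOnly!(immutable(int)[])();
}

void main()
{
    assert(squaresViaThread([1, 2, 3]) == [1, 4, 9]);
}
```

The compiler cannot check the uniqueness claim; that responsibility stays
with whoever writes the cast.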
