std.concurrency and efficient returns
Robert Jacques
sandford at jhu.edu
Sun Aug 1 08:55:54 PDT 2010
On Sun, 01 Aug 2010 06:24:18 -0400, Jonathan M Davis
<jmdavisprog at gmail.com> wrote:
> Okay. From what I can tell, it seems to be a recurring pattern with
> threads that it's useful to spawn a thread, have it do some work, and
> then have it return the result and terminate. The appropriate way to do
> that seems to be to spawn the thread with the data that needs to be
> passed and then use send to send what would normally be the return
> value before the function (and therefore the spawned thread)
> terminates. I see 2 problems with this, both stemming from
> immutability.
>
> 1. _All_ of the arguments passed to spawn must be immutable. It's not
> that hard to be in a situation where you need to pass it arguments that
> the parent thread will never use, and it's highly probable that that
> data will have to be copied to make it immutable so that it can be
> passed. The result is that you're forced to make pointless copies. If
> you're passing a lot of data, that could be expensive.
>
> 2. _All_ of the arguments returned via send must be immutable. In the
> scenario that I'm describing here, the thread is going away after
> sending the message, so there's no way that it's going to do anything
> with the data, and having to copy it to make it immutable (as will
> likely have to be done) can be highly inefficient.
>
> Is there a better way to do this? Or if not, can one be created? It
> seems to me that it would be highly desirable to be able to pass
> mutable reference types between threads where the thread doing the
> receiving takes control of the object/array being passed. Due to D's
> threading model, a copy may still have to be done behind the scenes,
> but if you could pass mutable data across while passing ownership, you
> could have at most 1 copy rather than the 2 - 3 copies that would have
> to be taking place when you have a mutable object that you're trying to
> send across threads (so, one copy to make it immutable, possibly a copy
> from one thread local storage to another of the immutable data (though
> I'd hope that that wouldn't require a copy), and one copy on the other
> end to get mutable data from the immutable data). As it stands, it
> seems painfully inefficient to me when you're passing anything other
> than small amounts of data across.
>
> Also, this recurring pattern that I'm seeing makes me wonder if it
> would be advantageous to have an addition to std.concurrency where you
> spawned a thread which returned a value when it was done (rather than
> having to use a send with a void function), and the parent thread used
> a receive call of some kind to get the return value. Ideally, you could
> spawn a series of threads which were paired with the variables that
> their return values would be assigned to, and you could do it all as
> one function call.
>
> Overall, I really like D's threading model, but it seems to me that it
> could be streamlined a bit.
>
> - Jonathan M Davis
Hi Jonathan,
It sounds like what you really want is a task-based parallel programming
library, as opposed to concurrent threads. I'd recommend Dave Simcha's
parallelFuture library if you want to play around with this in D
(http://www.dsource.org/projects/scrapple/browser/trunk/parallelFuture/parallelFuture.d).
However, parallelFuture is currently unsafe: you need to make sure that,
logically speaking, the data being passed to the task is immutable.
Shared/const/immutable delegates have been brought up before as a way to
formalize the implicit assumptions of libraries like parallelFuture, but
nothing has come of it yet.

As for std.concurrency, immutability is definitely the correct way to go,
even if it means extra copying: for most jobs the processing should
greatly outweigh the cost of copying and thread initialization (though
under the hood thread pools should help with the latter). A large amount
of experience shows that shared mutable data, let alone unprotected
mutable data, is a bug waiting to happen.

On a more practical note, relaxing either 1) or 2) can cause major
problems with certain modern GCs, so at a minimum casts should be involved.
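
To make the cast-based hand-off concrete, here is a minimal sketch of the spawn/send pattern where each side builds its data in a mutable buffer and then converts it to immutable without copying. It assumes a Phobos version in which assumeUnique lives in std.exception; the names worker and runDoubler are made up for illustration. The conversion is only sound because no other mutable reference to the buffer exists, and the compiler does not check that for you.

```d
import std.concurrency : spawn, send, receiveOnly, thisTid, Tid;
import std.exception : assumeUnique;

// Hypothetical worker: fills a mutable buffer, then hands it to the owner.
void worker(Tid owner, immutable(int)[] input)
{
    auto result = new int[](input.length);
    foreach (i, x; input)
        result[i] = x * 2;
    // assumeUnique casts to immutable without copying and nulls `result`.
    // Safe here only because nothing else refers to the buffer.
    owner.send(assumeUnique(result));
}

// Hypothetical helper: builds 1..n, doubles it in a spawned thread,
// and returns what the thread sent back.
immutable(int)[] runDoubler(size_t n)
{
    auto data = new int[](n);
    foreach (i, ref x; data)
        x = cast(int) i + 1;
    immutable(int)[] frozen = assumeUnique(data); // no copy, `data` is now null
    spawn(&worker, thisTid, frozen);
    return receiveOnly!(immutable(int)[])();
}

void main()
{
    assert(runDoubler(3) == [2, 4, 6]);
}
```

The data still arrives as immutable on the receiving end, so this removes the copies on the sending side only; getting a mutable array back would need a further cast or a dup.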