std.concurrency and efficient returns

Sun Aug 1 12:28:06 PDT 2010

On 01/08/2010 21:25, awishformore wrote:
> On 01/08/2010 19:17, dsimcha wrote:
>> == Quote from Jonathan M Davis (jmdavisprog at gmail.com)'s article
>>> Okay. From what I can tell, it seems to be a recurring pattern with threads
>>> that it's useful to spawn a thread, have it do some work, and then have it
>>> return the result and terminate. The appropriate way to do that seems to be
>>> to spawn the thread with the data that needs to be passed and then use send
>>> to send what would normally be the return value before the function (and
>>> therefore the spawned thread) terminates. I see 2 problems with this, both
>>> stemming from immutability.
>>
>> I think the bottom line is that D's threading model is designed to put safety
>> and simplicity over performance and flexibility. Given the number of bugs
>> that are apparently generated when using threading for concurrency in
>> large-scale software written by hordes of programmers, this may be a
>> reasonable tradeoff.
>>
>> Within the message-passing model, one thing that would help a lot is a Unique
>> type that can be implicitly and destructively converted to immutable or
>> shared. In D as it stands right now, immutable is basically useless in all
>> but the simplest cases because it's just too hard to build complex immutable
>> data structures, especially if you want to avoid unnecessary copying or
>> having to rely on casts and manually checked assumptions in at least small
>> areas of the program. In theory, immutable solves tons of problems, but in
>> practice it solves very few. While I don't understand shared that well, I
>> guess a Unique type would help in creating shared data, too.
>>
>> There are two reasons for using multithreading: Parallelism (using multiple
>> cores to increase throughput) and concurrency (making things appear to be
>> happening simultaneously to decrease latency; this makes sense even on a
>> single-core machine). One may come as a side effect of the other, but usually
>> only one is the goal. It sounds like you're looking for parallelism. When
>> using threading for parallelism as opposed to concurrency, this tradeoff of
>> simplicity and safety in exchange for flexibility and performance doesn't
>> work so well because:
>>
>> 1. When using threading for parallelism instead of concurrency, it's
>> reasonable to do some unsafe stuff to get better performance, since
>> performance is the whole point anyhow.
>>
>> 2. Unlike the concurrency case, the parallelism case usually occurs only in
>> small hotspots of a program, or in small scientific computing programs. In
>> these cases it's not that hard for the programmer to manually track what's
>> shared, etc.
>>
>> 3. In my experience at least, parallelism often requires finer grained
>> communication between threads than concurrency. For example, an OS timeslice
>> is about 15 milliseconds, meaning that on single core machines threads being
>> used for concurrency simply can't communicate more often than that. I've
>> written useful parallel code that scaled to at least 4 cores and required
>> communication between threads several times per millisecond. It could have
>> been written more efficiently w.r.t. communication between threads, but it
>> would have required a lot more memory allocations and been less efficient in
>> other respects.
>>
>> While I completely agree that message passing should be D's **flagship**
>> threading model because it's been proven to work well in a lot of cases, I'm
>> not sure if it should be the **only** one well-supported out of the box
>> because it's just too inflexible when you want pull-out-all-stops
>> parallelism. As Robert Jacques mentioned, I've been working on a parallelism
>> library. The code is at:
>>
>> http://dsource.org/projects/scrapple/browser/trunk/parallelFuture/parallelFuture.d
>>
>> The docs are at:
>>
>> http://cis.jhu.edu/~dsimcha/parallelFuture.html
>>
>> I've been thinking lately about how to integrate this into the new threading
>> model, as it's currently completely unsafe, doesn't use shared at all, and
>> was written before the new threading model was implemented. (core.thread
>> still takes an unshared delegate.) I think before we can solve the problems
>> you've brought up, we need to clarify how non-message-passing-based
>> multithreading (i.e. using shared) is going to work in D, as right now it is
>> completely unclear, at least to me.
>
> I completely agree with everything you said and I really dislike how D2
> currently seems to virtually impose an application architecture based on
> the message passing model if you don't want to circumvent and thus break
> the entire type system. While I do agree that message passing makes a
> lot of sense as the default choice, there also has to be well
> thought-out and extensive support for the shared memory model if D2 is
> really focusing on the concurrency issue as much as it claims.
>
> Personally, I've found hybrid architectures where both models are
> combined as needed to be the most flexible and best performing approach
> and there is no way a language touted to be a systems language should
> impose one model over the other and stop the programmer from doing
> things the way he wants.
>
> /Max

P.S.: I find this to be especially true when taking into account the 
pragmatic approach under which D is supposed to be designed. D2 sounds a 
lot more idealistic than pragmatic, especially when it comes to 
concurrency, and I find that to be a very worrisome development.

/Max