[dmd-concurrency] Vot de hekk is shared good for, anyway?
Walter Bright
walter at digitalmars.com
Thu Jan 7 22:58:01 PST 2010
Michel Fortin wrote:
> On 2010-01-07 at 20:28, Walter Bright wrote:
>
>
>> Having a per-thread gc is an optimization, not a fundamental feature of the concurrency model. For one thing, it precludes casting data to immutable. For another, it may result in excessive memory consumption as one thread may have a lot of unused data in its pool that is not available for allocation by another thread.
>>
>
> Both the "per-thread GC + shared GC" model and "the shared GC for everyone" model can be seen as optimizations. The first optimizes for speed, the second optimizes for memory usage.
>
> Depending on what you do, it might even make sense to have some threads using the shared GC for everything and others having a thread-local GC to improve speed.
>
> If you want the language to be limited to models where the memory can always be shared between all threads, then that's fine. It's your prerogative. I'm not so sure it's wise to limit shared semantics to this scenario just to avoid having the shared-immutable combo, but if you're sure that's what you want then I'll stick to it.
>
>
There's another aspect here. Consider all the problems we have getting
across the idea of an immutable type. What hope is there for shared? I
see mass confusion everywhere. Frankly, I see little hope of any but a
handful of programmers ever being able to grok shared and use it
correctly for concurrent programs. The notion that one can just slap
'shared' on a data type and have it work correctly across threads
without further thought is a pipe dream.

So what to do?

I want to pin the mainstream concurrency on message passing. The message
passing user never sees shared, never has to deal with locks, never has
to deal with memory barriers. It just works. Message passing should be a
robust, scalable solution for most users. I believe the Erlang
experience validates this. Go and Scala also rely entirely on message
passing (but they don't have immutable data, so their models are unsafe
and I predict many rude surprises).
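
A minimal sketch of the message-passing style described above, assuming D2's std.concurrency module (spawn, send, receiveOnly). The worker and squareViaMessage names are made up for illustration and are not part of any library. Note that the user code contains no locks, no memory barriers and no 'shared':

```d
import std.concurrency : receiveOnly, send, spawn, thisTid, Tid;
import std.stdio : writeln;

// Hypothetical worker: waits for one int, replies with its square.
void worker(Tid owner)
{
    int n = receiveOnly!int();
    send(owner, n * n);
}

// Hypothetical helper: one round trip through a freshly spawned thread.
int squareViaMessage(int n)
{
    Tid tid = spawn(&worker, thisTid);
    send(tid, n);
    return receiveOnly!int();
}

void main()
{
    writeln(squareViaMessage(7)); // prints 49
}
```

The two threads only exchange values, so neither can observe the other's data in a half-updated state.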

So why bother with shared at all?

Because message passing does not cover all the bases, and D is supposed
to be a systems programming language. So we need a paradigm for
synchronization and shared data structures. What shared provides is:
1. A way to identify shared data. This is incredibly important. A lot of
sharing bugs come about because of inadvertent unrecognized sharing of
data. This should be pretty much impossible in D. Furthermore, if you do
have a sharing bug in your code, you look at the 1% of the data tagged
as shared, rather than every freakin' line of code and every piece of
data. Half the battle in debugging code is figuring out where to look
for the problem. Shared pares that problem down to a reasonable size.
2. Shared comes with a collection of static typing rules and guarantees,
such as sequential consistency, that will head off a number of
concurrency bugs.
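
To illustrate point 1, here is a rough sketch, assuming D2's rule that module-level variables are thread-local unless tagged 'shared', and using core.atomic and core.thread from druntime. The names hits, scratch, touch and runTouchOnWorker are hypothetical. It only shows how the tag separates the two kinds of data; it does not show the full set of typing rules:

```d
import core.atomic : atomicLoad, atomicOp;
import core.thread : Thread;

shared int hits;   // tagged shared: one instance, visible to all threads
int scratch;       // no tag: every thread gets its own copy

void touch()
{
    atomicOp!"+="(hits, 1); // shared data is updated atomically
    scratch += 1;           // plain access, no other thread can see it
}

// Hypothetical helper: run touch() once on a separate thread.
void runTouchOnWorker()
{
    auto t = new Thread(&touch);
    t.start();
    t.join();
}

void main()
{
    runTouchOnWorker();
    assert(atomicLoad(hits) == 1); // the worker's update is visible here
    assert(scratch == 0);          // the worker changed only its own copy
}
```

When hunting a sharing bug in a program written this way, only hits needs to be examined, since scratch cannot be reached from another thread.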

I view shared as sort of like the latest electric arc welders which
automatically adjust the current and wire feed for you. They
dramatically shorten (but don't eliminate) the learning curve for people
trying to master the art of welding. D is the only language to even
attempt this. C++ leaves you completely on your own, Java offers no
help, Erlang, Scala and Go throw in the towel and won't allow anything
but message passing.

As for a shared gc vs thread local gc, I just see an awful lot of
strange irreproducible bugs when someone passes data from one to the
other. I doubt it's worth it, unless it can be done with compiler
guarantees, which seem doubtful.