Some missing things in the current threading implementation

dsimcha dsimcha at yahoo.com
Sun Sep 12 07:28:38 PDT 2010


== Quote from Sönke_Ludwig (ludwig at informatik.uni-luebeck.de)'s article
> Now, however, after TDPL has been released and there is some
> documentation + std.concurrency, the system should be in a state where
> it is actually useful and only some bugs are left to fix - none of
> which should require inherent changes to the system. The reality is
> quite different once you step anywhere off the already-walked path
> (defined by the book examples and similar things).

std.concurrency takes the point of view that simplicity and safety should come
first, and performance and flexibility second.  I thoroughly appreciate this post,
as it gives ideas for either improving std.concurrency or creating alternative models.

> I apologize for the length of this post, although I already tried to
> make it as short as possible and left out a lot of details.

No need to apologize, I think it's great that you're willing to put this much
effort into it.

> 1. spawn and objects
> 	Spawn only supports 'function' + some bound parameters. Since taking
> the address of an object method in D always yields a delegate, it is not
> possible to call class members without a static wrapper function. This
> can be quite disruptive when working in an object-oriented style (C++
> obviously has the same problem).

Except in the case of an immutable or shared object, this would be unsafe, as it
would allow implicit sharing.  I do agree, though, that delegates need to be
allowed if they're immutable or shared delegates.  Right now, taking the address
of a shared/immutable member function doesn't yield a shared/immutable delegate.
There are bug reports on this somewhere in Bugzilla.
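
In the meantime, the workaround is the static wrapper the original post
mentions.  Below is a rough sketch of that pattern (Worker and runWorker are
invented names, not anything from the post); it assumes the work lives in a
shared method so that handing the instance to spawn stays safe:

---
import std.concurrency;

class Worker {
    // A shared method is what a shared delegate would eventually bind to
    // once the Bugzilla issues mentioned above are fixed.
    void run() shared { /* thread-safe work goes here */ }
}

// Static wrapper: spawn() only accepts a function pointer plus values, so
// the member call is forwarded through a free function that takes the
// shared instance explicitly.
void runWorker(shared Worker w) {
    w.run();
}

void main() {
    auto w = cast(shared) new Worker;  // no implicit conversion to shared
    spawn(&runWorker, w);
}
---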

> 2. error messages
> 	Right now, error messages just state that there is a shared/unshared
> mismatch somewhere. For a non-shared-expert, this can be a real bummer.
> You have to know a lot about the implications of 'shared' to be able to
> correctly interpret these messages and track down the cause. Not very
> good for a feature that is meant to make threading easier.

Agreed.  Whenever you run into an unreasonably obtuse error message, a bug report
would be appreciated.  Bug reports about wrong or extremely obtuse error messages
are considered "real", though low-priority, bugs around here.
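
As an illustration (my own example, not taken from the post), code along these
lines is a typical trigger.  The argument has mutable, thread-local indirection,
but the diagnostic points at a failed template constraint or static assert deep
inside std.concurrency rather than plainly naming the offending argument:

---
import std.concurrency;

void worker(int[] arr) {}

void main() {
    int[] data = [1, 2, 3];
    // Rejected because int[] carries unshared mutable references, which
    // spawn() must not let cross the thread boundary.
    spawn(&worker, data);
}
---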

> 4. steep learning curve - more a high learning wall to climb on
> 	Resulting from the first points, my feeling is that a newcomer who
> has not followed the discussions and thoughts about the system here
> will find himself facing a very high barrier of material to learn
> before he can actually put any of it to use. I also imagine this to be
> a very painful process, because of all the things that turn out not to
> be possible and the error messages that make you bang your head against
> the wall.

True, but I think this is just a fact of life when dealing with concurrency in
general.  Gradually (partly thanks to people like you pointing out the relevant
issues) the documentation, etc. will improve.

> 5. advanced synchronization primitives need to be considered
> 	Things such as core.sync.condition (the most important one) need to be
> considered in the 'shared' system. This means there needs to be a
> condition variable that takes a shared object instead of a mutex, or you
> have to be able to query an object's mutex.

The whole point of D's flagship concurrency model is that you're supposed to use
message passing for most things.  Therefore, lock-based programming is kind of
half-heartedly supported.  It sounds like you're looking for a low-level model
(which is available via core.thread and core.sync, though it isn't the flagship
model).  std.concurrency is meant to be a high-level model useful for simple, safe
everyday concurrency, not the **only** be-all-and-end-all model of multithreading
in D.
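
To make that concrete, here is a rough sketch (consumer is an invented name) of
how a wait/notify pattern typically looks in std.concurrency: the blocking
receive() plays the role of Condition.wait(), and send() plays the role of
notify():

---
import std.concurrency;
import std.stdio;

// The consumer blocks in receive() until a message arrives; no mutex or
// condition variable is visible in user code.
void consumer() {
    bool done = false;
    while (!done) {
        receive(
            (int work) { writeln("processing ", work); },
            (OwnerTerminated e) { done = true; }  // owner exited, shut down
        );
    }
}

void main() {
    auto tid = spawn(&consumer);
    foreach (i; 0 .. 3)
        send(tid, i);   // each send wakes the blocked consumer
}
---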

> 6. temporary unlock
> 	There are often situations in lock-based programming in which you
> need to temporarily unlock your mutex, perform some time-consuming
> external task (disk i/o, ...) and then reacquire the mutex. This
> feature needs language support, especially because it is really
> difficult and dirty to work around. It could be something like the
> inverse of a synchronized {} scope, or the possibility to define a
> special kind of private member function that unlocks the mutex. Inside
> such blocks the compiler of course has to make sure that the
> appropriate access rules are not broken (which could be as conservative
> as disallowing access to any class member).

Again, the point of std.concurrency is to be primarily message passing-based.  It
really sounds like what you want is a lower-level model.  Again, it's available,
but it's not considered the flagship model.
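
For completeness, the usual hand-rolled version of that pattern in the low-level
model looks roughly like the sketch below (the names are made up for
illustration).  As the post points out, nothing verifies that the code in the
unlocked window stays away from the protected state:

---
import core.sync.mutex;

__gshared Mutex mtx;
__gshared int sharedState;

shared static this() { mtx = new Mutex; }

// "Temporary unlock" by hand: copy what you need while holding the lock,
// release it around the slow external call, then reacquire it to publish
// the result.  The compiler checks none of this.
void slowOperation() {
    mtx.lock();
    int snapshot = sharedState;
    mtx.unlock();

    // ... long-running disk I/O that uses only `snapshot` ...

    mtx.lock();
    sharedState = snapshot + 1;
    mtx.unlock();
}
---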

> 7. optimization of pseudo-shared objects
> 	Since the sharability/'synchronized'-ness of an object is already
> decided at class definition time, for performance reasons it should be
> possible to somehow disable the mutex of those instances that are only
> used thread-locally. Maybe it should be necessary to declare objects as
> "shared C c;" even if the class is defined as "synchronized class C {}",
> or else you get an object without a mutex, which is not shared?

Agreed.  IMHO locks should only be taken on a synchronized object if its
compile-time type is shared.  Casting away shared should result in locks not being
used.

> 9. unique
> 	Unique objects or chunks of data are really important not only to be
> able to check that a cast to 'immutable' is correct, but also to allow
> for passing objects to another thread for computations without making a
> superfluous copy or doing superfluous computation.

A Unique type is in std.typecons.  I don't know how well it currently works, but I
agree that we need a way to express uniqueness to make creating immutable data
possible.
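
In the meantime, the closest thing to expressing that intent is
std.exception.assumeUnique, which is an unchecked promise rather than a verified
one.  A minimal sketch (buildTable is an invented name):

---
import std.exception : assumeUnique;

immutable(int)[] buildTable() {
    // Build the data as ordinary mutable memory...
    auto tmp = new int[100];
    foreach (i, ref x; tmp)
        x = cast(int)(i * i);

    // ...then "bless" it as immutable.  assumeUnique is unchecked: the
    // caller promises that no other mutable reference to tmp exists,
    // which is exactly the uniqueness guarantee being asked for above.
    return assumeUnique(tmp);
}
---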

> 11. holes in the system
> 	It seems like there are a lot of ways in which you can still slip
> non-shared data into a shared context.
> 	One example: passing a shared array to a spawned function that
> expects an unshared one, as in
> 	---
> 		void fnc(int[] arr);
> 		void fnc2(){
> 			shared int[] arr;
> 			spawn(&fnc, arr);
> 		}
> 	---
> 	compiles. This is just a bug and probably easy to fix, but what about:

Definitely just a bug.

> 	---
> 		class C {
> 			private void method();
> 			private void method2(){
> 				spawn( void function(C inst){ inst.method(); }, this );
> 			}
> 		}
> 	---

Just tested this, and it doesn't compile.

> 	II. Implementation of a ThreadPool
> 		The majority of applications can very well be broken up into small
> chunks of work that can be processed in parallel. Instead of using a
> costly thread-create, run-task, thread-destroy cycle, it would be wise
> to reuse the threads for later tasks. The implementation of a thread
> pool that does this is of course a low-level thing, and you could argue
> that it is ok to use some casts and similar tricks here. Anyway, quite
> a few things are still missing here.

My std.parallelism module, which is currently being reviewed for inclusion in
Phobos, has a thread pool and task parallelism, though it is completely unsafe
(i.e. it allows implicit sharing and will not be allowed in @safe code).
std.concurrency was simply not designed for pull-out-all-stops parallelism, and
pull-out-all-stops parallelism is inherently harder than basic concurrency to
make safe.  I've given up on making most of std.parallelism safe, but I think I
may be able to make a few islands of it safe.  The question is whether those
islands would allow enough useful things to be worth the effort.  See the recent
thread on safe asynchronous function calls.  Since it sounds like you need
something like this, I'd sincerely appreciate your comments on the module.  The
docs are at:

http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html

Code is at:

http://dsource.org/projects/scrapple/browser/trunk/parallelFuture/std_parallelism.d
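
As a quick taste, submitting a job to the pool of reusable worker threads looks
roughly like this (a sketch against the proposed API; names such as slowSquare
are invented, and details may still change during review):

---
import std.parallelism;

int slowSquare(int x) {
    return x * x;   // stand-in for an expensive computation
}

void main() {
    // Hand the job to reusable worker threads instead of paying for a
    // thread-create/thread-destroy cycle per task.
    auto t = task!slowSquare(21);
    taskPool.put(t);

    // ... do other work ...

    int result = t.yieldForce;   // block until the result is ready
    assert(result == 441);
}
---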


> 	III. multiple threads computing separate parts of an array
> 		Probably the simplest form of parallelism is to perform similar
> operations on each element of an array (or similar things on regions of
> the array) and to do this in separate threads.
> 		The good news is that this works in the current implementation. The
> bad news is that it is really slow, because you either have to use
> atomic operations on the elements or it is unsafe and prone to low-level
> races. Right now the compiler checks almost nothing.

This is also covered by the proposed std.parallelism module, though it's
completely unsafe because it needs to be fast.
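
The array case maps onto the module's parallel foreach, which hands each worker
thread its own chunk of the array so the elements themselves need no atomics
(again, a sketch against the proposed API):

---
import std.math : sqrt;
import std.parallelism;

void main() {
    auto data = new double[1_000_000];

    // Each worker processes its own contiguous chunk of `data`.
    foreach (i, ref x; parallel(data)) {
        x = sqrt(cast(double) i);
    }
}
---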


