[dmd-concurrency] shared arrays

Thu Jan 14 13:07:21 PST 2010

----- Original Message ----
> From: Sean Kelly <sean at invisibleduck.org>
> 
> Hm... I hadn't planned to add a wait() call for stuff exposed by spawn, but I 
> suppose it's a logical extension of watch(tid) (ie. "please notify me when this 
> thread exits"), which we were going to provide.  Adding a wait() wrapper for 
> this would be trivial.

I wasn't requesting that, I just didn't know the planned API :)  It was pseudocode.

> > 
> On Jan 14, 2010, at 12:20 PM, Steve Schveighoffer wrote:
> > The problem is, the compiler doesn't know with an array of items whether it's 
> the array that must be atomic or the elements that must be atomic, or some other 
> relationship (such as a group of elements are related as in utf-8 code points).  
> It should either refuse copying any data, or allow copying any data.  Making a 
> decision based on assumptions of the array semantic meaning doesn't seem right 
> to me.
> 
> D allows non utf-8 data in a char[], so I don't see any reason for it to try and 
> guarantee any meaningful result from such an operation.

In all cases I've seen, D tries to treat char[]s as string types.  I don't see why shared(char)[] types should be any different.  In the context of strings, the whole thing is the type, not the individual characters.  To treat it differently is to ask for trouble.  I've also seen many mentions of "don't use char for anything but UTF-8 data; use ubyte for everything else."
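
To make the "whole thing is the type" point concrete, here is a minimal sketch of what a copy that stops between related elements leaves behind.  The helper name tornCopy is hypothetical and only simulates an interrupted copy; std.utf.validate and UTFException are the standard Phobos names.

```d
import std.exception : assertThrown;
import std.utf : UTFException, validate;

// Hypothetical helper: returns the first n code units of s, which is what
// a reader would see if a copy stopped (or was observed) part-way through.
string tornCopy(string s, size_t n)
{
    return s[0 .. n];
}

unittest
{
    string s = "h\u00E9llo";          // 'é' occupies two UTF-8 code units
    assert(s.length == 6);

    validate(s);                      // the whole string is valid UTF-8
    validate(tornCopy(s, 1));         // a cut on a character boundary is fine

    // A cut between the two code units of 'é' is no longer valid UTF-8.
    assertThrown!UTFException(validate(tornCopy(s, 2)));
}
```

Every individual char was copied intact here, yet the result is not a valid string, which is why per-element atomicity says little about the array as a whole.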

I don't have a problem with the compiler allowing copying of strings, even if it results in weird data.  But it should be consistent and allow copying of other array types too (or structs of any size).  In other words, the compiler should not make assumptions: it admits that it's unsure what you want, and on those grounds either a) it allows you to copy anything (I guess I trust you, you are the programmer) or b) it refuses to copy anything (I assume you don't know what you are doing unless you use casting or have locked something).

>  Earlier, I had been 
> thinking it might be nice to have this though:
> 
> shared(char)[] a, b;
> 
> synchronized( lock( a, b ) ) {
>     // some fancy algorithm on a and b
> }
>
> Basically, use the hashtable of mutexes discussed earlier to allow users to 
> obtain locks on a set of N arrays in a safe manner (because expecting them to do 
> it manually will generally result in deadlock).  This makes what's happening 
> explicit and allows the whole mess to be handled in library code.  In theory, 
> this same approach could work for any reference type.  The optimization issue 
> would be making gc_query() not need to obtain the GC lock to return a valid 
> result (this may be safe already, I haven't spent the time to figure it out).

If the compiler only allowed operations on arrays if there was a lock held, I would be OK with that too.  That goes nicely with the "refuses to copy anything" idea.
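
For reference, the lock( a, b ) idea quoted above might look roughly like the following in library code.  This is only a sketch under assumptions: the table size, lockTable, slotOf, lockOrder and withLocks are all made-up names, not a planned API, and the table is keyed on the array's address rather than on anything gc_query() would return.  The part that avoids deadlock is that every thread acquires its slots in the same ascending order.

```d
import core.sync.mutex : Mutex;
import std.algorithm.iteration : uniq;
import std.algorithm.sorting : sort;
import std.array : array;

enum tableSize = 64;                   // hypothetical table size
__gshared Mutex[tableSize] lockTable;  // one mutex per hash slot

shared static this()
{
    foreach (ref m; lockTable)
        m = new Mutex;
}

// Map an array's base address to a slot in the table.
size_t slotOf(size_t addr)
{
    return (addr >> 4) % tableSize;
}

// The slots to lock, sorted and de-duplicated, so that all threads
// acquire them in the same global order.
size_t[] lockOrder(size_t[] addrs...)
{
    size_t[] slots;
    foreach (a; addrs)
        slots ~= slotOf(a);
    sort(slots);
    return slots.uniq.array;
}

// Run dg while holding the table lock for every address passed in.
void withLocks(scope void delegate() dg, size_t[] addrs...)
{
    auto order = lockOrder(addrs);
    foreach (s; order)
        lockTable[s].lock();
    scope (exit)
        foreach_reverse (s; order)
            lockTable[s].unlock();
    dg();
}

unittest
{
    auto a = cast(shared(char)[]) new char[8];
    auto b = cast(shared(char)[]) new char[8];
    bool ran;
    // some fancy algorithm on a and b would go in the delegate
    withLocks({ ran = true; }, cast(size_t) a.ptr, cast(size_t) b.ptr);
    assert(ran);
}
```

Two arrays can hash to the same slot, which only costs some extra contention.  Note that nothing here stops code from touching a or b without calling withLocks, which is where a compiler rule would have to come in.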

The tough part then becomes how to associate a lock with data.  That is, let's say I have something like:

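class A
{
    long[] myArray;
    // takes the monitor of the A instance, not a lock on myArray itself
    synchronized void foo() {...}
}

shared(long)[] globalarray;

void main()
{
    auto a = new shared(A);
    // after this, globalarray and a.myArray refer to the same elements
    synchronized(a) globalarray = a.myArray;
    ...
}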

Now, can two threads simultaneously access the data used by a.myArray and globalarray?  If so, then does that mean that A.foo must also acquire the 'magic array' lock on myArray?  It looks to me like you would have to, which kind of defeats the purpose of having to lock a to access myArray.
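
A compilable reduction of the problem, as a sketch only: otherLock stands in for the hypothetical 'magic array' lock, sameMemory is a made-up helper, and the cast to shared stands in for whatever rule would allow the assignment.

```d
import core.atomic : atomicStore;

class A
{
    long[] myArray;
    this() { myArray = new long[4]; }
}

// Hypothetical helper: true if both slices start at the same address.
bool sameMemory(const shared(long)[] g, const long[] l)
{
    return cast(const(void)*) g.ptr is cast(const(void)*) l.ptr;
}

unittest
{
    auto a = new A;
    auto otherLock = new Object;   // stand-in for the per-array lock

    // Aliases a.myArray: no elements are copied.
    shared(long)[] globalarray = cast(shared(long)[]) a.myArray;
    assert(sameMemory(globalarray, a.myArray));

    // Two different locks, one block of memory: neither excludes the other.
    synchronized (a) a.myArray[0] = 1;
    synchronized (otherLock) atomicStore(globalarray[0], 2L);
    assert(a.myArray[0] == 2);
}
```

The two synchronized statements use unrelated monitors, so on two threads they could run at the same time against the same elements.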

-Steve