asynchronous communication between threads

Thu Jan 3 11:34:33 PST 2013

03-Jan-2013 22:38, Charles Hixson пишет:
> On 01/03/2013 08:40 AM, Dmitry Olshansky wrote:
>> 02-Jan-2013 03:54, Charles Hixson пишет:
>>> If I were to use the below as an asynchronous communication channel,
>>> would it avoid deadlocks (presuming that only Cell called Msg) and that
>>> when a thread activated Cell, the first thing it did was process it's
>>> mailbox?
>>> Also, if only around 7 cells were created in the main thread, would RAM
>>> usage remain reasonable (i.e., when a thread was through with an MCell,
>>> would only one copy continue to exist in RAM? Or would there be a copy
>>> for each thread that had ever referenced it? Should MCell instances be
>>> marked shared?
>>
>> That's a lot of questions to ask and currently I hardly can decipher the
>> whole logic from this and the code below alone, too much is behind the
>> scenes.
>>
>> I'm willing to help out if you could post a more complete example or
>> more accurate questions as e.g. "created 7 cells in main thread" could
>> be done very differently.
>>
>> Also a nit at this size of code (and for me it's unformatted) I'd
>> recommend linking to the code on some paste service e.g. dpaste
>> http://dpaste.dzfl.pl/.
>>
>> [snip]
>>
> I'm trying to design a program (I'm not writing it yet) that requires
> asynchronous communication between LOTS of separate "cells".  You can
> think of it as vaguely like a neural network, but it doesn't fit the
> definition, that's just an analogy.  I thought that D would be a good
> language to do this in, but all the built-in approaches seem to be
> "overly protective", locking a lot more than I want and yielding
> deadlocks, or copying more than I want, and ensuring that I end up with
> stale data.  The approach that I've come up with to avoiding this is to
> attach a mailbox to each "cell".  In an ideal world, each cell would be
> a separate thread, but as there will be at least hundreds of thousands
> of them, and likely millions, that's unreasonable.  So I need the
> threads to cycle through pools of "cells".  Each cell may send messages
> to any other cell (not really, but you don't know in advance what the
> connections will be).

So cell is in fact a task (more common name for it I think) with a 
mailbox and you want threads multiplexed across these tasks. The task is 
running some code/function/callback/whatever that periodically polls a 
mailbox & puts stuff in other task's mailboxes. So far good?

Then definitely take a look at Fiber in druntime (it's core.Fiber AFAIK).

>
> That skeleton of code was intended to show the idea of cells isolated
> from outside attached to a mailbox, which has blocking access from the
> outside.  The cells should never block, but this is largely because they
> are only directly accessed by the thread within which they run.  The
> mailboxes can receive messages from anyone, but their processes are so
> short, that blocking will be extremely brief.

I'm sure not so optimistic about locking. Even though it's brief there 
are many threads that may be putting stuff  simultaneously into 
mailboxes thus contending on the lock and causing context switches.
+ the lock/unlock of a mutex is not free. The lock-based message queue 
is nice for a start (less _subtle_ bugs) but you sure got to look into 
lock-free alternatives later on.

> I wanted to know if this proposed design would work, as in not getting
> into deadlocks, not blocking excessively, and not giving me excessively
> stale data.

The the crucial part is missing - taking a message out of the mailbox ;)

But anyway let's focus on the details. 2 classes and 2 functions. Cell's 
send & MBox's receive.

Let's suppose we have 2 Cells A & B and their mboxes MA & MB.
 From the code I see (that's full of typos? MCell & Cell are used 
interchangeably) the chain of event for full send from A --> B is:

1. A Cell's send locks cell A. (send is sync-ed)
2. It locks target cell B.
3. It then locks its mailbox MB.
4. undoes all the locks backwards.

Then there is of course a deadlock bound to happen if B is sending 
message in opposite direction, e.g. :
1. A locks A (making its first step)
2. B lock B (ditto)
3. A locks B & blocks
3. B locks A & blocks

that if for instance there is step 2. So I guess you haven't meant the 
step 2.

If there is no lock of the cell except before sending then it looks 
legit as there are 2 locks protecting separate entities:

- one is to protect message queue on putting message into it

- the other one is to protect ... what exactly? send is already 
implicitly guarded by target queue's lock.

So I'd say you only need to guard the message queue and that's about it. 
The only other concern is properly scheduling the execution of tasks (or 
your cells) so that one runs on no more then one thread at any given time.

In simplest case just make a locked (or lock-free) queue of these and 
let threads pick cells/tasks from it and put back when they are done.

Far better is a large bounded queue that you never ever remove/put stuff 
into. It's just a big array of pointers/references to task. A thread 
then just goes around it looking for valid/ready entries (they stay 
allocated so no nulls there) and executes them. That goes firmly into 
lock-free zone to make it correctly synchronized though but with a bit 
of care it should be doable.

The second one also can be done with locks. In this case a thread goes 
through all of tasks/cells and tries to lock them (That's where your 
lock around it comes in, is it?). If it locks - cool, work on it, if not 
- try the next one.

 > More details aren't available, because I didn't want to
> commit to this design if the basic design was wrong, so I haven't
> written them.  It has been suggested that since the cell will only be
> accessed by the thread, it doesn't need to be synchronous.
>

> I'm really nervous about how many copies of the cell will exist,
> however.  Since there are going to be so many of them, if I ended up
> with a cell/thread, the system would bog down in unused storage.  But
> the mailbox needs to be globally accessible for the scheme to work.

Everything on the heap is accessible from other threads, provided they 
have the pointer to the object in question.

> N.B.:  When a cell receives a message from another cell it's likely, but
> not guaranteed, to send a response back.  It may also send responses
> onwards.  And the number of cells isn't fixed, nor is the number of
> their connections.  (This is less important in D than in many other
> languages, but even in D it affects serialization.)
>
> FWIW, I've been told that an approximately similar design has worked in
> Ada, though the design was written in Ada-83.  (In Ada the mailbox was
> protected, which I take to be approximately the same as synchronized.)

In general there are ways to make it fly. The tricks to use depend on 
the use case and what is bottleneck (I/O or the CPU time).
The pain points is faster mailboxes and better scheduling (as in less 
context switches for nothing, faster turn-around time etc.).

-- 
Dmitry Olshansky