[dmd-concurrency] Sending mesages to non-listening threads

Wed Jan 13 07:58:54 PST 2010

On Jan 13, 2010, at 12:48 AM, Kevin Bealer wrote:
> 
> I've tried to implement request / response on top of a system that happens to use blocking queues.  Come to discover, that this pattern results in deadlock.  A 'server' actor can fill up with messages on its input queue and its output queue.  If any of the actors writing requests into the server are also blocked waiting for responses, you can get a deadlock where A is writing a response but B can't get it because as part of processing another new incoming request, it has blocked on writing another request to A.

We'll have a timeout feature that will cause a blocking receive to abort after a set interval.  This is necessary to allow other work to be done by a thread anyway.

> I think blocking is kind of like reference counted memory -- you don't want to do it unless there is a way to break the cycles in the graph.  In this case that would mean a few queues that are not capable of blocking (they would be the 'ever expanding' kind), in the right places.  (I'm not sure if this is enough.  Even if it is enough in theory it might not be enough in practice as the system might grind to a halt if things expand too much.)
> 
> An idea that I think is likely to work better is to prioritize actors reading from large input queues over actors reading from small ones, or else prioritize readers from large queues over writers to them.  Depending on circularity in the design, it might be possible to prioritize "later" processes over "earlier" ones.  None of this is fool proof of course.

Good point.  And for people who really care about deadlocks, the CSP algebra (and others) makes it possible to mathematically prove that a messaging network is deadlock-free. I'm not 100% positive that this algebra works for an asynchronous model that spans a network (CSP does mean "communicating /sequential/ processes"), but last I checked there was work being done in this area.

> Suppose you have three processes, A, B, and C, that process each piece of data in that order, and C is slower than the others, and you have at least four CPUs working on the problem.
> 
> My hunch is that you will not be able to avoid an ever expanding queue between B and C, or else wasted CPU time, no matter what you do at the framework level.  If there is circularity as well, you may get an ever expanding queue, wasted CPU, or deadlock.  If there is enough circularity and/or general complexity in the overall message passing 'graph' you may not even be able to reason about what is going to happen.
> 
> You could send messages "backward" along the chain saying "please stop until I tell you otherwise".  That approach could be used to resist this problem.  (NOTE: If you have these meta-messages, be sure to check for them before checking for more incoming elephants that need to be turned into hot dogs.)

Yeah, one option is to handle this at the protocol level.  I agree that we'll likely need some options for mailbox maintenance as well though.  It's easy enough to clear out a queue by looping on receive((Variant) {...}), but you're right that it may be nice to suspend a queue, etc, in some instances.  This sort of thing is often necessary in network server programming, so there's some justification for it in practice as well.  My primary interest now is finding a solid set of primitives that can be built upon as need dictates.  It's easy enough to add a feature later, but having to change one just stinks.

> --> This assumes you can tell how full your own mailbox is!

Hm... I guess it may be worth looking at MQueue packages for tips as well.  We're mostly building a transport mechanism here, but there obviously may be a few features we could add to simplify implementing things on top of it.  I'll need to check Erlang again as well to see how they handle this.

> And one downside of this application-controlled volume throttling is that the system can probably no longer reason about deadlocks at all.  Since the messages going "backward" are always part of a two-way flow, there is no way for the system to automatically tell which end of the animal is the head and which is the tail.
> 
> This suggests that different policies might be available for each queue... (does anyone have experience with a system that has such configurable queues?)

Not I, though MQueue would be someplace to look.  I don't want to get too fancy about queue implementation though, if we can avoid it.  One obvious issue with throwing away arbitrary messages is that it becomes impossible to reason about the behavior of the system.  But since this would be under user control, I guess that's their choice and their problem :-p

> My advice in general is that this kind of message passing system can't be completely encapsulated.  There are failure states, the application designer *needs* to be able to measure the system to gauge whether you are about to stumble into one, and in general, designing the message passing control and recovery behavior is part of designing your system.
> 
> And it is probably one of the next big "it's hard to get right" topics for concurrent programming.

You're probably right.