[dmd-concurrency] Thread termination protocol (shutdown protocol evolved)

Thu Jan 21 10:36:08 PST 2010

On Thu, 21 Jan 2010 13:11:17 -0500, Michel Fortin  
<michel.fortin at michelf.com> wrote:

> Here is another idea for the "shutdown protocol". I'm changing the name  
> to better reflect what the proposal is. Also take note that I've renamed  
> the "Shutdown" exception to "Terminated".
>
> It includes ideas from my previous proposal as well as from how Erlang  
> handles linked processes. Linked processes in Erlang define an error  
> handling mechanism, much like the one I'm proposing here. I was mistaken  
> before about how it worked and what it did. This time I've integrated  
> the concept correctly.
>
> Thank you for reading! This might take a while. :-)
>
>  - - -
>
> The thread termination protocol has two goals:
>
> * Establish a generic way of expressing when you want a thread to  
> terminate that can cover a majority of cases. But it's important that  
> cases not supported by it can still be handled by user-constructed  
> termination protocols.
>
> * Establish a generic way to handle thrown exceptions in spawned threads.
>
> So the thread termination protocol relies on five important points:
>
> 1. When spawning a thread, the parent thread is set as the owner of the  
> new one.
>
> 2. The owner link with the child thread can be broken by choosing  
> another thread as the owner. Setting the owner to the main thread means  
> that you don't want the child to be terminated until the program itself  
> terminates.
>
> 3. When a thread terminates, it sends a Terminated exception message to  
> each of the threads it owns.
>
> 4. When a child thread receives a Terminated exception message, the  
> thread can handle it and even ignore it if it wants. But in the absence  
> of corresponding message handlers and exception handlers, the thrown  
> exception will stop the thread.
>
> 5. When a thread terminates via an exception other than Terminated, the  
> exception is sent back as a message to the owner thread. In the absence  
> of corresponding message handlers and exception handlers, the thrown  
> exception will stop the owner thread and thus again send the exception  
> as a message to the owner's owner, until it reaches the main thread  
> (which has no owner).
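
The five points above can be sketched as a small single-threaded model. This is Python rather than D, purely to make the message flow visible. The class ModelThread and its methods spawn, finish and deliver are hypothetical names, not part of any proposed API, and no thread installs a handler, so every exception message stops its receiver. Re-parenting (point 2) and skipping over already-dead owners are not modelled here.

```python
class Terminated(Exception):
    """Stand-in for the proposed Terminated exception."""


class ModelThread:
    """Hypothetical model of a thread: an owner link plus a message log."""

    def __init__(self, name, owner=None):
        self.name = name
        self.owner = owner          # point 1: the spawner becomes the owner
        self.owned = []
        self.alive = True
        self.inbox = []             # every message delivered while alive
        if owner is not None:
            owner.owned.append(self)

    def spawn(self, name):
        return ModelThread(name, owner=self)

    def finish(self, error=None):
        """Terminate this thread, normally or because of `error`."""
        self.alive = False
        for child in self.owned:    # point 3: notify the owned threads
            child.deliver(Terminated(self.name))
        if error is not None and not isinstance(error, Terminated):
            if self.owner is not None:      # point 5: report to the owner
                self.owner.deliver(error)

    def deliver(self, message):
        """No handlers are installed, so an exception message stops the thread."""
        if not self.alive:
            return                  # dead threads drop their messages
        self.inbox.append(message)
        if isinstance(message, Exception):  # points 4 and 5
            self.finish(error=message)


main = ModelThread("main")
reader = main.spawn("reader")
processor = reader.spawn("processor")
writer = processor.spawn("writer")

error = ValueError("disk full")
writer.finish(error)         # writer dies from a non-Terminated exception
```

In this model the ValueError travels writer -> processor -> reader -> main and stops each of them, while a plain main.finish() would instead push Terminated down the chain without anything travelling back up.
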
>
> Here is an important thing: sending a Terminated exception must not  
> prevent the thread from receiving more messages afterwards. If the child  
> thread chooses to ignore the Terminated message, then nothing prevents it  
> from continuing to receive messages normally afterward. One reason for this  
> is that it might want to postpone termination to perform a closing  
> handshake with something else it is currently communicating with.
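
A rough sketch of that postponed shutdown, again as a Python model rather than real D threads. The worker function, the message strings and the "close ack" handshake are all made up for illustration. The only point is that the receive loop keeps running after Terminated has been seen.

```python
from collections import deque


class Terminated(Exception):
    """Stand-in for the proposed Terminated exception."""


def worker(mailbox):
    """Drain `mailbox`, postponing shutdown until the peer acknowledges.

    Terminated is noted but does not stop the loop, so messages queued
    after it are still received.
    """
    handled = []
    terminating = False
    while mailbox:
        message = mailbox.popleft()
        if isinstance(message, Terminated):
            terminating = True           # remember the request, keep going
            handled.append("sent close request to peer")
        elif message == "peer: close ack" and terminating:
            handled.append("handshake done, stopping")
            break
        else:
            handled.append(f"processed {message}")
    return handled


log = worker(deque(["chunk 1", Terminated(), "chunk 2",
                    "peer: close ack", "chunk 3"]))
```

Here "chunk 2" is still processed even though it arrives after Terminated, and "chunk 3" is left unread because the worker stops once the handshake completes.
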
>
> Also important is that you can manually send a Terminated message to a  
> thread at any time when you want it to terminate.
>
> And we might want to add a Tid field to the Exception class to identify  
> the thread it originated from.
>
>  - - -
>
> Now, let's see how it works with various use cases. (This first case is  
> pretty much a repetition of the one that came along with my previous  
> shutdown protocol proposal.)
>
> For the file copy example with an intermediate processing step, it's a  
> simple ownership graph:
>
> 	main -owns- read thread -owns- processing thread -owns- writer thread
>
> When main terminates, it sends Terminated to the read thread, which  
> ignores it because it's reading from a file. When the read thread  
> finishes reading, it terminates and sends a Terminated to the processing  
> thread, which will receive it as its last message. When the processing  
> thread receives Terminated, it terminates, which automatically sends a  
> Terminated message to the writer thread. The writer thread then  
> terminates after writing the last part. At this moment the program  
> closes.
>
> What happens if the writer thread throws an exception (other than  
> Terminated)? The exception will terminate the writer thread, be sent  
> back as a message to the processing thread, which will terminate and  
> send the exception to the reader thread, which will terminate and send  
> the exception to the main thread, which will terminate the program. If  
> any of those threads in the middle of the chain is already terminated  
> when the exception is thrown, the exception is sent directly to the  
> owner's owner.
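
The "skip terminated threads" rule in the last sentence amounts to walking up the ownership chain until a living thread is found. A minimal sketch, assuming each thread record keeps an owner reference and a liveness flag (Node and exception_target are hypothetical names):

```python
class Node:
    """Hypothetical thread record: a name, an owner and a liveness flag."""

    def __init__(self, name, owner=None):
        self.name = name
        self.owner = owner
        self.alive = True


def exception_target(thread):
    """Return the nearest living ancestor of `thread`, or None if there is none."""
    owner = thread.owner
    while owner is not None and not owner.alive:
        owner = owner.owner
    return owner


main = Node("main")
reader = Node("reader", owner=main)
processor = Node("processor", owner=reader)
writer = Node("writer", owner=processor)

processor.alive = False                # middle of the chain is already gone
target = exception_target(writer)      # skips processor, lands on reader
```
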
>
> Of course, any thread in the graph might catch the exception, preventing  
> it from percolating to other threads.
>
> So this simple case works well out of the box. That's because the graph  
> is a simple tree. If you have a thread spawning a child thread only to  
> then give it to another thread, then you'll probably want to decide  
> yourself when you want to terminate it and who should handle exceptions.  
> Here is how that should work:
>
> 1. Create your thread, setting ownership to the main thread.
> 2. Give the Tid to whoever you want.
> 3. ...
> 4. Send the thread a Terminated exception when you're done with it.
>
> Here the owner thread just acts as a safeguard in case you forget to  
> send a Terminated message manually. You can set the owner to any thread  
> that lives longer than the spawned thread, not necessarily the main  
> thread. When you know you want to terminate the thread, just send it a  
> Terminated exception.
>
> You might want to set up a special "monitoring" thread as the owner of  
> such child threads. This thread could catch exceptions leaking from  
> child threads and do some error handling.
>
>  - - -
>
> For the API, I propose this:
> 	
> 	spawn(function, args...)
> 	// creates a new thread having the spawning thread as the owner.
>
> 	spawnOwned(ownerTid, function, args...)
> 	// creates a new thread with a specific owner.
>
> 	tid.owner = ownerTid
> 	// Changes the owner of a thread.
> 	// Note: this needs to be protected against circular ownerships.
>
> 	terminate(tid);
> 	// Sends a Terminated exception to the thread. This only works for
> 	// threads listening for messages.
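
One way the circular-ownership check on tid.owner could work is to walk up from the candidate owner and refuse the assignment if the walk reaches the thread being re-parented. The sketch below models that in Python. Tid here is a plain hypothetical handle with a list as its mailbox, not the real D type, and spawn_owned only mirrors the shape of the proposed spawnOwned.

```python
class Terminated(Exception):
    """Stand-in for the proposed Terminated exception."""


class Tid:
    """Hypothetical thread handle with an owner link and a mailbox."""

    def __init__(self, name, owner=None):
        self.name = name
        self._owner = None
        self.mailbox = []
        if owner is not None:
            self.owner = owner            # goes through the cycle check

    @property
    def owner(self):
        return self._owner

    @owner.setter
    def owner(self, new_owner):
        # Walk up from the candidate owner; meeting `self` means a cycle.
        node = new_owner
        while node is not None:
            if node is self:
                raise ValueError(f"ownership cycle through {self.name}")
            node = node.owner
        self._owner = new_owner

    def terminate(self):
        """Queue a Terminated message; the target sees it when it next receives."""
        self.mailbox.append(Terminated())


def spawn_owned(owner, name):
    """Model of spawnOwned(ownerTid, ...): create a handle with a chosen owner."""
    return Tid(name, owner=owner)


main = Tid("main")
a = spawn_owned(main, "a")
b = spawn_owned(a, "b")
b.owner = main            # re-parenting to another living thread is fine
try:
    main.owner = b        # main would both own b and be owned by b
    cycle_rejected = False
except ValueError:
    cycle_rejected = True
b.terminate()
```

The walk costs one step per level of the ownership chain. In a real implementation it would presumably also need to be atomic with respect to other owner changes, which this model ignores.
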
>
> This differs from Erlang in only two notable ways:
>
> 1. You cannot have unlinked threads. This ensures that all threads  
> receive a Terminated message eventually (if they don't terminate by  
> themselves before that). This also makes sure that uncaught exceptions  
> will always be propagated back to somewhere, right up to the main thread  
> if you don't catch them.
>
> 2. Sending a Terminated exception is a standard way to tell a thread to  
> just stop. I don't think there is such a thing in Erlang. Fortunately,  
> you don't have to obey the Terminated message if you don't want to, but  
> most likely you'll just want to postpone termination while you clean  
> things up.

Looks okay at first glance. To reduce namespace pollution: terminate(tid)  
-> tid.terminate. Also, making spawnOwned an overload of spawn should be  
considered. To clarify: exception and Terminated messages are passed with  
the same priority as normal messages, so they only get re-thrown after all  
prior messages have been received and handled. Correct?
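
The reading being asked about, as a small Python model. It assumes a strictly FIFO mailbox with no priority lane, which is exactly the open question, and Mailbox, send and receive are hypothetical names. The exception is re-thrown only when it reaches the front of the queue.

```python
from collections import deque


class Terminated(Exception):
    """Stand-in for the proposed Terminated exception."""


class Mailbox:
    """Hypothetical FIFO mailbox with no priority lane for exceptions."""

    def __init__(self):
        self._queue = deque()

    def send(self, message):
        self._queue.append(message)

    def receive(self):
        """Return the oldest message, re-throwing it if it is an exception."""
        message = self._queue.popleft()
        if isinstance(message, Exception):
            raise message
        return message


box = Mailbox()
box.send("block 1")
box.send("block 2")
box.send(Terminated())
box.send("block 3")          # sent after the termination request

seen = []
try:
    while True:
        seen.append(box.receive())
except Terminated:
    pass                     # a thread without a handler would stop here
```

Under this reading both earlier blocks are handled before Terminated surfaces, and "block 3" is still in the queue for a thread that chooses to keep receiving.
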