[dmd-concurrency] synchronized, shared, and regular methods inside the same class

Wed Jan 6 07:09:19 PST 2010

On 2010-01-06 at 7:52, Álvaro Castro-Castilla wrote:

> 
>>> that would create a variable doing the work of "y", but in case you
>>> want to specify the operations.
>>> 
>>> atomic {
>>> int y = x;
>>> y++;
>>> x = y;
>>> }
>> 
>> This syntax I don't like. While enclosing one operation in an atomic block
>> is fine, here it looks as though all those actions are done in one atomic
>> operation, which isn't at all the case. How hard is it to write this
>> instead:
>> 
>>       int y = atomic(x);
>>       y++;
>>       atomic(x) = y;
> 
> Just to clarify better what I meant:
> atomic{} would be supported from the compiler for defining critical regions
> as transactions. It would group the whole transaction "atomicIncrement". I
> understand that generalizing this would be hard, or maybe not possible if
> you are calling functions inside that code. However, allowing only calling
> pure functions might be doable.
> 
> Anyway, I just pointed this thing out, because the whole debate heads to
> Message Passing as the main way to do concurrency in D, leaving traditional
> threads and especially Software Transactional Memory behind. But how
> easy/possible would be to implement as a library? (just a question)

Ok, so I understand that you meant "atomic { int y = x; y++; x = y; }" to be some kind of transaction, where the whole becomes atomic. I'd be very happy to see transactional memory in D. I have experience with database transactions, but I'm not sure how familiar I am with the topic of software transactional memory.

Anyway, let's talk about it. I see two ways to implement transactions. One is some sort of copy-on-write, where everything you touch in the transaction is copied: on success you commit, replacing the existing data with your modified copy; on failure (when someone concurrently changed something used in the transaction) you discard the result and start again. The other is to acquire locks on everything you touch: on success you commit and release the locks; on failure (most likely a deadlock) you undo every change and start again. You can even mix these two methods together.
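
To make the first approach concrete, here is a minimal sketch reduced to a single shared integer, using cas() from core.atomic. The name atomicIncrement is borrowed from the example earlier in this thread; a real transaction touching several objects would need much more machinery than this, so take it only as an illustration of the snapshot/commit/retry cycle.

```d
import core.atomic : atomicLoad, cas;

shared int x;

// Optimistic "transaction" on one word: take a snapshot, modify a private
// copy, and commit only if x still holds the snapshot value.
int atomicIncrement() {
    while (true) {
        int snapshot = atomicLoad(x);     // read the current value
        int modified = snapshot + 1;      // work on a private copy
        if (cas(&x, snapshot, modified))  // commit if nobody changed x meanwhile
            return modified;
        // Commit failed: another thread changed x, so discard and start again.
    }
}
```

Nothing is ever locked here; a conflicting writer only costs the loser one more trip around the loop.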

I don't think the language should impose any particular way of doing transactions. In fact, I'd even say any standard transaction machinery in the language should integrate nicely with external transactional engines (databases, mostly).

So here is a simple but still verbose way to implement a transaction:

	void doTransaction(void delegate() transaction) {
		bool done = false;
		while (!done) {
			try {
				transaction();
				done = true;
			}
			catch (ConcurrentAccessException e)
				done = false;
		}
	}

	void myTransaction() pure synchronized {
		auto saved_x = x;
		x = createNew();
		scope(failure) x = saved_x;

		// should detect deadlocks and throw a ConcurrentAccessException
		synchronized (z) {
			void delegate() z_rollback = z.add(x);
			scope(failure) z_rollback();

			// rest of transaction goes here.
		}
	}

	doTransaction(&myTransaction);

Here, if synchronization fails for z in myTransaction because of a deadlock, a ConcurrentAccessException is thrown, x is reverted to its previous state, and the transaction is attempted again. Note that it's important that all functions in a transaction be pure: they should only affect the arguments you pass to them, otherwise you can't really restore their state.

If the compiler were to help with something, I'd say it should define a "transactional" attribute for methods, which would roll back everything upon failure. Basically, that's the strong exception guarantee: if an exception occurs, everything is left in the same state as it was before. Additionally, you need to be able to roll back the changes later, after the function call has returned, so a transactional function would need to give a rollback delegate to its caller (like z.add() does above).
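
Without any compiler support, the rollback-delegate convention can be sketched by hand. Registry and addBoth below are hypothetical names made up for this illustration, and the undo action assumes rollbacks run in reverse order of the additions.

```d
// Hypothetical container: add() applies a change and hands back its undo action.
class Registry {
    private int[] items;

    void delegate() add(int value) {
        items ~= value;
        // The rollback drops the element that was just appended.
        return () { items = items[0 .. $ - 1]; };
    }

    size_t length() const { return items.length; }
}

// Two steps treated as one unit: if the second step fails, the first is undone.
void addBoth(Registry r, int a, int b) {
    auto undoA = r.add(a);
    scope(failure) undoA();   // runs only if an exception escapes this scope
    if (b < 0)
        throw new Exception("negative value rejected");
    r.add(b);
}
```

A call such as addBoth(r, 3, -1) throws and leaves r exactly as it was, which is the strong guarantee described above.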

That'd be quite interesting. Unfortunately, I doubt there will be enough time for anything like that.

> I guess
> threads will be kept, but STM is not being mentioned. My point is that there
> are situations where MP is not the best/easiest solution. For instance, for
> multi-agent simulations, you could think that MP would work well. However,
> when you want to visualize the simulation you need to duplicate the number
> of messages and send them to a central thread for processing the whole data
> and visualize it. With normal threading this would be the typical "dirty
> flag", stop the simulation and bring the data into the vis. thread; with STM
> you could just read a snapshot of the data and visualize it with no need for
> locks.

Your comparison of normal threading and STM is interesting. You say: "with STM you could just read a snapshot of the data". This implies that an immutable snapshot of the data exists, which implies some sort of copy-on-write where the simulation copies the data every time it changes. It won't stop the simulation in order to access the data, but it'll likely make it perform slower with all those allocations. It's really a tradeoff between data immutability and mutability. To access mutable data, you need a lock; not so for immutable data.

So while I agree that supporting transactions would be great, I'm not sure how great it'd be for your use case, especially since, if you allocate a lot of memory for immutable data, the GC will kick in more often and stop the world anyway.

Message passing, where you could subscribe and unsubscribe to various parts of the simulation data, could be a good solution because then the simulation thread has to create copies of only what has been requested.
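
As a rough sketch of that idea with std.concurrency: the simulation thread keeps its data mutable and makes an immutable copy only when someone asks for one. The function names, and the convention of sending a Tid to request a snapshot and a bool to stop, are assumptions made up for this example, not an existing protocol.

```d
import core.time : msecs;
import std.concurrency : receive, receiveTimeout, send, spawn, thisTid, Tid;

// Hypothetical simulation loop: it owns the mutable agent data and copies it
// only when another thread requests a snapshot.
void simulation() {
    int[] agents = [0, 0, 0];
    bool running = true;
    while (running) {
        foreach (ref a; agents) a++;   // one simulation step on mutable data

        // Check the mailbox briefly, handling at most one message per step.
        receiveTimeout(1.msecs,
            (Tid requester) { send(requester, agents.idup); },  // copy on demand
            (bool stop) { running = !stop; }
        );
    }
}

// Visualization side: ask for a snapshot and wait for the reply.
immutable(int)[] requestSnapshot(Tid sim) {
    immutable(int)[] result;
    send(sim, thisTid);
    receive((immutable(int)[] snapshot) { result = snapshot; });
    return result;
}
```

The visualization thread would call spawn(&simulation) once, then requestSnapshot(sim) whenever it wants to redraw, and send(sim, true) to stop the simulation. No copy is made when nobody is watching.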

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/