isolated/owned would solve many problems we face right now.

deadalnix deadalnix at gmail.com
Fri Oct 11 13:42:08 PDT 2013


You know an idea is good when it can solve a large variety of 
problems and is by itself of limited complexity. I want to 
propose one of these ideas today.

Everything starts with this paper: 
http://research.microsoft.com/pubs/170528/msr-tr-2012-79.pdf

The paper describes how Microsoft is experimenting with immutable 
and isolated in C#. We already have immutable. However, we do not 
have isolated and, strangely enough, adapted to D it can solve an 
unexpectedly large variety of issues, several of which do not 
even exist in C#.

Let me start by summing up some of the advantages it can bring to 
D, then I'll explain how this would work in D:
  - Transfer of ownership of data from one thread to another.
  - Convenient construction of immutable or shared objects.
  - Lock on a subtree of objects.
  - Reduce friction to use RefCounted safely.
  - Ensure safety of std.parallelism.
  - Opportunity for the compiler to insert explicit free and 
reduce GC pressure.
  - Avoid unnecessary array copies when slicing.
  - More optimization opportunities via alias analysis.

All of this comes at a moderate cost. We need to add an extra 
type qualifier. This isn't as bad as it seems, because this type 
qualifier does NOT compose with other type qualifiers the regular 
way, so we do not have a combinatorial explosion of cases to 
consider. The cost may still seem high, but considering it solves 
a large variety of recurring problems we have while adding more 
optimization opportunities, I do think it is worth paying.

This qualifier can probably be inferred by the compiler in many 
situations, but not always. For now I'll call it isolated, to 
mimic C#. I do think that owned is a nice alternative, but let's 
not make discussing that name the meat of the thread.

Before going further, I need to introduce the concept of an 
island. An island is a group of objects that can refer to each 
other as well as to immutable objects. They can also refer to 
other islands, as long as the island graph is acyclic (which is 
ensured as long as the type system isn't broken). User code can 
hold only one reference to one object in the island.

The compiler needs to keep track of islands, but they do not need 
to be expressed explicitly. At some point in the program, an 
island can be merged into the shared, thread-local or immutable 
heap. The island then ceases to exist.

Each explicit use of isolated means a new island. Each assignment 
of an isolated to something else means that the island is merged:
isolated Foo a = ...;
immutable b = a; // The island referred to by a is promoted to
                 // immutable.

isolated Foo a = ...;
auto b = a; // The island referred to by a is promoted to
            // thread-local.

isolated Foo a = ...;
struct Bar {
     Foo field;
}
isolated Bar b = ...;
b.field = a; // a's island is merged in b's island.

When an island is merged, all references to that island are 
invalidated. This means that a is no longer usable after the 
assignment in each of the samples above. Note that passing a as a 
function argument has exactly the same effect.

As a result, when you manipulate an isolated, you know you are 
the only one holding a reference to it (or the type system has 
been broken).
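For comparison, the closest thing present-day D offers is a 
runtime approximation: std.typecons.Unique holds a single owning 
reference, and release empties the source when ownership moves. 
The sketch below only illustrates that "old reference is 
invalidated on transfer" behaviour; Foo and consume are made-up 
names, and unlike the proposed isolated, nothing here is checked 
by the type system beyond Unique being non-copyable.

```d
import std.typecons : Unique;

// Hypothetical payload type, for illustration only.
struct Foo
{
    int value;
}

// Taking Unique by value means the caller must give up ownership.
int consume(Unique!Foo f)
{
    return f.value;
}

void main()
{
    Unique!Foo a = new Foo(42);
    assert(!a.isEmpty);

    // consume(a) would not compile: Unique is not copyable.
    // release moves the payload out and leaves `a` empty.
    assert(consume(a.release) == 42);
    assert(a.isEmpty);
}
```

With isolated, the use of a after the transfer would be rejected 
at compile time instead of being detectable only at run time.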

isolated is transitive, but you'll find some subtleties compared 
to other qualifiers. Let's see how it goes :
class A {
     B mfield;
     isolated B ifield;
}

A a;
a.ifield; // This is isolated.

immutable A a;
a.ifield; // This is immutable.

shared A a;
a.ifield; // This is isolated.

isolated A a;
a.mfield; // This is isolated. Same island as a.
a.ifield; // This is isolated. Different island than a.

Now we have isolated objects and can assign to them. But we also 
need to do the operation the other way around. The only way to do 
that is to swap.

A a;
B b = a.ifield; // Error, as a.ifield is not a local.

A a;
B b;
swap(b, a.ifield);  // OK, you can swap isolated.

This implies adding some unsafe black magic to swap, but it is 
100% safe to use from the outside.
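As a point of reference, this is what the swap pattern looks like 
with today's std.algorithm.swap on ordinary (non-isolated) 
references. It is only a sketch of the shape of the operation: 
the classes and the takeField helper are hypothetical, and 
current D does not enforce any uniqueness here.

```d
import std.algorithm : swap;

// Hypothetical types mirroring the A/B example above,
// without the proposed isolated qualifier.
class B
{
    int value;
    this(int v) { value = v; }
}

class A
{
    B ifield;
}

// Exchange the field with a replacement and hand back the old
// value, so the field is never read out without being replaced.
B takeField(A a, B replacement)
{
    swap(replacement, a.ifield);
    return replacement; // now holds what used to be in a.ifield
}

void main()
{
    auto a = new A;
    a.ifield = new B(1);

    B old = takeField(a, new B(2));
    assert(old.value == 1);
    assert(a.ifield.value == 2);
}
```

The point of going through swap is that the field always holds 
exactly one reference before and after, so no second reference to 
the same island is ever created.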

To be complete, we need to discuss how isolated and postblit 
interact. When an isolated is passed to another isolated, 
previous references are invalidated. This means that the postblit 
does not run on isolated structs.

Now that the mechanism has been explained, I can spell out how it 
solves the problems mentioned above:
  - Transfer of ownership of data from one thread to another.
std.concurrency can now offer functions taking isolated in their 
interface. As passing an isolated as an argument invalidates 
other isolated references from the same island, the thread 
sending data effectively loses its ownership and can't use the 
data anymore.

  - Convenient construction of immutable or shared objects.
Objects can be constructed as isolated and then merged into the 
immutable or shared heap.

  - Lock on a subtree of objects.
If an object owns some of its subdata, it should mark it as 
isolated. Then we do not need to recursively lock on a shared 
object to use its owned internals.

  - Reduce friction to use RefCounted safely.
RefCounted can take an isolated reference when constructed. It 
can blast the whole island (and all islands referred to by that 
island) when the reference count goes to 0. That makes RefCounted 
safer to construct and more powerful in what memory it can 
control.

  - Ensure safety of std.parallelism.
isolated(Stuff)[] can be used to process each item of the slice 
in parallel, 100% safely.

  - Opportunity for the compiler to insert explicit free and 
reduce GC pressure.
If an island has not been invalidated by the time it goes out of 
scope, the compiler can blast the whole island and the islands it 
refers to.

  - Avoid unnecessary array copies when slicing.
It is always safe to append to an isolated slice. No need to 
allocate a new array and copy.

  - More optimization opportunities via alias analysis.
Items from different islands cannot alias each other.
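
To make the first two points concrete, here is roughly how they 
are handled in D today, as a sketch rather than a recommendation. 
std.exception.assumeUnique converts a freshly built mutable array 
to immutable and nulls the source variable, and std.concurrency 
then accepts the immutable data. The names buildSquares, worker 
and sumInWorker are invented for this example. The key difference 
from the proposal is that assumeUnique is an unchecked promise by 
the programmer, whereas isolated would let the compiler verify 
that no other reference exists.

```d
import std.concurrency;
import std.exception : assumeUnique;

// Build a mutable buffer, then hand it over as immutable.
immutable(int)[] buildSquares(size_t n)
{
    int[] buf = new int[](n);
    foreach (i, ref x; buf)
        x = cast(int)(i * i);

    // assumeUnique sets `buf` to null and returns the same
    // memory typed as immutable. Nothing verifies that no other
    // mutable reference to the buffer exists.
    return assumeUnique(buf);
}

// Worker thread: receives the immutable slice, replies with
// the sum of its elements.
void worker()
{
    auto data = receiveOnly!(immutable(int)[])();
    int sum = 0;
    foreach (x; data)
        sum += x;
    ownerTid.send(sum);
}

// Spawn a worker, send it the data and wait for the result.
int sumInWorker(immutable(int)[] data)
{
    auto tid = spawn(&worker);
    tid.send(data);
    return receiveOnly!int();
}

void main()
{
    auto squares = buildSquares(4);
    assert(squares == [0, 1, 4, 9]);
    assert(sumInWorker(squares) == 14);
}
```

Note that this only covers the immutable case. Handing over 
mutable data to another thread without sharing it is exactly the 
part that needs something like isolated.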

