Deterministic lifetime storage type

Sat Apr 21 07:22:41 PDT 2012

Hi. I don't have time to follow all the discussions here, but for some 
time I have had an idea that I would like to share, to find out whether 
it may interest programmers in various fields. The idea is a major 
change to the language (perhaps for D.3): give the compiler and the 
programmer tools to know the lifetime of any object in memory, to allow 
better memory management. This could dramatically reduce the use of the 
GC, and even allow extra optimisations. It would also solve the 
cast-to-immutable problem, and perhaps even some r-value/l-value issues. 
However, this feature requires some discipline to use, which is why I 
would like to know whether people would be interested, or whether it is 
too much to add to a programming language.
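
For readers unfamiliar with the cast-to-immutable problem, here is a 
minimal sketch of how it looks in current D, assuming that 
std.exception.assumeUnique is the usual workaround. The function name 
oneTwoThree is only illustrative. The compiler cannot verify the 
uniqueness promise; it is the programmer's responsibility.

```d
import std.exception : assumeUnique;

// Builds an array mutably, then hands it out as immutable.
// assumeUnique is an unchecked promise that no other mutable
// reference to the data exists; it also resets `r` to null.
immutable(int)[] oneTwoThree()
{
    int[] r = new int[3];
    foreach (i, ref x; r)
        x = cast(int) i + 1;
    return assumeUnique(r);
}

unittest
{
    immutable(int)[] a = oneTwoThree();
    assert(a == [1, 2, 3]);
}
```

With lifetime information in the signature, the promise made by hand 
here is the kind of thing the compiler could check instead.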

Now, this is the idea in a few words: 
In each function signature, you can add information about whether the 
function may keep references to its parameters or return value. Then, 
when you declare a variable, you can say how long you want to use that 
variable. With this information, the compiler can check that you use 
your variables correctly, and can destroy each variable at the right 
time.
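
As a point of comparison, current D already destroys struct values 
deterministically at the end of their scope. The sketch below only 
illustrates that existing behaviour (Tracker and its live counter are 
hypothetical names); the proposal aims at giving heap allocations a 
similarly predictable lifetime.

```d
struct Tracker
{
    static int live;       // instances currently alive (hypothetical counter)
    int id;

    @disable this(this);   // forbid copies so the count stays exact

    this(int id)
    {
        this.id = id;
        ++live;
    }

    ~this()
    {
        --live;            // runs when the variable leaves its scope
    }
}

unittest
{
    assert(Tracker.live == 0);
    {
        auto t = Tracker(1);
        assert(Tracker.live == 1);
    }                      // t is destroyed here, without the GC
    assert(Tracker.live == 0);
}
```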

To do this, I'll alter the meaning of the scope, in, out and inout 
keywords to create new storage types:

 - dynamic variable: this refers to a variable to which references can 
be freely taken. It is allocated on the heap, and garbage collected the 
usual way. This is the default, but an additional keyword, "dynamic", 
may be used to explicitly declare a dynamic variable.

 Example:
 | dynamic(int)[] a = [1, 2, 3]; // same as: int[] a = [1, 2, 3];
 | dynamic(int) b = 5; // roughly: auto p = new int; *p = 5;

 - scope variable: this refers to a variable for which we can be sure 
that no reference to the variable, or to any subpart of it (scope is 
transitive), will survive the current scope. No dynamic reference to a 
scope variable can be made.

 Example:
 | int[] g;
 | 
 | void main()
 | {
 |    scope int[] a = [1, 2, 3]; // the allocated array can be destroyed 
 |                               // at the end of the current scope
 |    scope int b = 5; // same as: int b = 5; (except that no closures 
 |                     // are allowed)
 |
 |    g = a[]; // error: no reference to a may escape main's scope.
 | }

A specific scope, different from the current scope, can be specified by 
adding parentheses to the scope keyword:

 - scope(in): This scope is a bridge between scope and dynamic. 
Variables of any scope (including dynamic variables) can be cast to 
scope(in). External references to a scope(in) variable may exist, but no 
new reference to a scope(in) variable that survives the current scope 
may be made. Several scope(in) variables usually do not share the same 
scope (use scope(label) for that).

 Example:
 | int[] g;
 | 
 | void main()
 | {
 |   int[] a = [1, 2, 3];
 |   scope int[] b = [4, 5, 6];
 |
 |   scope(in) int[] c = a[]; // ok
 |   c = b[]; // ok
 |   g = c[]; // error: no reference to c may escape main's scope.
 | }

 - scope(out): This scope is for variables to be returned. When a 
scope(out) variable is returned, the calling function can be sure that 
no reference to the variable or any of its subparts exists anywhere 
except in the returned value itself. The caller may cast the scope(out) 
variable to any scope, and may even cast it to immutable. The caller 
"decides" what the scope of the scope(out) variable is.

 Example:
 | scope(out) int[] oneTwoThree()
 | {
 |   scope(out) int[] r = [1, 2, 3];
 |   return r;
 | }
 | 
 | void main()
 | {
 |   scope a = oneTwoThree(); // the caller chooses the result's scope
 | }

 - scope(inout): A combination of scope(in) and scope(out): no 
reference to the variable that survives the scope may be taken, except 
through the returned value.

 Example:
 | scope(inout) int[] firstHalf(scope(inout) int[] a)
 | {
 |   return a[0..$/2];
 | }

 - scope(label) variable: the variable shares its scope with the 
variable or label "label".

 Example:
 | void main()
 | {
 |   scope a = [1, 2, 3];
 |   {
 |     scope(a) b = [3, 4, 5];
 |     a = b; //  ok, b has a's scope
 |   }
 | }

In addition, to make scope usage less verbose, we may make in, out, and 
inout parameters and return values implicitly scope(in), scope(out), 
and scope(inout) respectively, in addition to their current meaning, as 
long as code breakage is tolerable (so probably not before D.3, unless 
this proposal gets more approval than I expect).

This scope system is very similar to the mutable/immutable system. It is 
optional (one may code without it). There is transitivity, a bridge 
type (const or scope(in)), and also the same virality (is this an 
English word??). This means that, to be usable, this system requires 
restricting the usage of function parameters and return values with the 
appropriate keywords (scope(in), scope(out) or scope(inout)); otherwise 
a scoped variable can't be passed to a function and is not usable in 
practice. But in my opinion, the gain is very large. When it is used, 
variable lifetimes become deterministic, the compiler can destroy 
variables at the right time, and the GC is used only when necessary, or 
for global variables.
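
To make the analogy concrete, here is a small sketch of the existing 
const bridge in current D (the function name total is only 
illustrative). A const parameter accepts both mutable and immutable 
arguments; scope(in) is meant to play the same role between dynamic and 
scope variables.

```d
// const(int)[] is the bridge type: both int[] and immutable(int)[]
// convert to it implicitly, and the function cannot modify the data.
int total(const(int)[] xs)
{
    int s = 0;
    foreach (x; xs)
        s += x;
    return s;
}

unittest
{
    int[] m = [1, 2, 3];               // mutable data
    immutable(int)[] i = [4, 5, 6];    // immutable data
    assert(total(m) == 6);
    assert(total(i) == 15);
}
```

The virality shows up in the same way: a function that takes a plain 
int[] cannot be called with immutable data, just as a function without 
a scope annotation could not be called with a scoped variable.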

I have only given a few definitions here, from which a whole scope 
system can be deduced and implemented. I've given it more thought, but 
this post is long enough for now, so I will let you give me your 
thoughts, and will gladly answer your questions about any subtleties 
that may arise, feasibility, etc.

-- 
Christophe Travert