Scope as reference-counting modifier; amnesic

Thu Feb 7 05:48:38 PST 2008

On 2008-02-05 23:45:42 -0500, Edward Diener 
<eddielee_no_spam_here at tropicsoft.com> said:

>>> Michel Fortin wrote:
>>> 
>>>> But there are still many holes in this scheme in which scope now means 
>>>> reference-counted. Take this example:
>>>> 
>>>>     class A {
>>>>         void doSomething() {
>>>>             globalReferences ~= this;
>>>>         }
>>>>     }
>>>>     scope class B : A { }
>>>> 
>>>>     A[] globalReferences;
>>>> 
>>>>     // Scope could be made implicit here, but it's irrelevant to my example
>>>>     scope B b = new B;
>>>>     b.doSomething();
> 
> I am lost about what you are saying above. Member functions have 
> nothing to do with 'scope'.

The thing is that if we have a static reference-counting type modifier 
(the scope keyword in this case), the compiler has to emit code to 
increment and decrement the reference count each time we add or remove 
a reference to a scope object. To do that, it has to know when 
compiling a function whether or not the object's type is scope.

In the above example, class A has a doSomething function which adds a 
reference to itself to some global variable. Since 'this' is of type A 
(not scope A) in the doSomething function, no code is added to 
reference-count the object when assigning it to the global variable. 
Hence, if you could call doSomething on a scope A and then remove all 
other references to the object, its reference count would drop to zero 
and it would be deleted while the global variable still refers to it.

(Having a B class derived from A just makes the thing harder to spot. 
It's basically the same thing as having a scope A object though.)
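
To make the failure concrete, here is a minimal sketch that does by hand 
the counting the compiler would emit. The refCount and destroyed fields, 
the retain and release helpers, and runHazard are all hypothetical names 
I'm assuming for illustration; they only model the proposal, they are 
not part of it.

```d
// Hand-written model of compiler-emitted reference counting.
// refCount, destroyed, retain, release and runHazard are hypothetical names.
class A {
    int refCount = 0;       // stands in for the hidden counter
    bool destroyed = false; // stands in for "the object was deleted"

    void retain() { ++refCount; }

    void release() {
        if (--refCount == 0)
            destroyed = true;
    }

    // Compiled for a plain A: stores `this` without touching the counter.
    void doSomething() {
        globalReferences ~= this;
    }
}

A[] globalReferences;

// Plays the role of a block holding the only counted reference to the object.
A runHazard() {
    auto a = new A;
    a.retain();      // counted: the local scope reference
    a.doSomething(); // not counted: the global reference
    a.release();     // the local reference goes away, count drops to zero
    return a;
}

void main() {
    auto a = runHazard();
    assert(a.destroyed);                  // the model considers it deleted...
    assert(globalReferences[$ - 1] is a); // ...but the global still points at it
}
```

The object is flagged as destroyed while globalReferences still holds 
it, which is exactly the dangling reference described above.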

The obvious solution is this:

    class A {
        void doSomething() {
            globalReferences ~= this;
        }
        // scope is an attribute of the function here
        scope void doSomething() {
            globalReferences ~= this;
        }
    }

where one doSomething has no code to maintain the reference counter and 
the other version has. The first version would be called when the 
compiler has a non-scope A while the second would be called for a scope 
A.

That essentially means that you couldn't call a scope function on a 
non-scope object and vice versa. Each function therefore needs to be 
duplicated, with a scope and a non-scope variant, just in case it puts 
a reference to the object somewhere that'll still exist after the 
function call.

There is an obvious solution to that problem though: as the D source 
code for the two member functions is the same, the compiler could just 
compile the two variants from the same source. Unfortunately, the 
compiler would *always* have to generate two symbols for each member 
function, even if the function stores no reference anywhere, so that 
the scope variant can be reached when calling from elsewhere. (Remember 
that when compiling a function call the compiler doesn't know what 
happens inside the function, so it can't just assume the scope variant 
doesn't exist.)

    class A {
        // automatically generates the two functions from the example above
        void doSomething() {
            globalReferences ~= this;
        }
    }
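
Generating both variants from one source can be approximated by hand 
with a template, which may help picture what the compiler would be 
doing. This is only a sketch: refCount, doSomethingImpl and 
doSomethingCounted are hypothetical names, and the real scheme would 
presumably pick the variant automatically from the object's type 
instead of by method name.

```d
// One source body, two generated variants, modeled with a template.
// refCount, doSomethingImpl and doSomethingCounted are hypothetical names.
class A {
    int refCount = 1; // stands in for the hidden counter, one initial reference

    // `counted` selects at compile time whether the bookkeeping is emitted.
    private void doSomethingImpl(bool counted)() {
        static if (counted)
            ++refCount; // only present in the variant for scope objects
        globalReferences ~= this;
    }

    void doSomething() { doSomethingImpl!false(); }       // for a non-scope A
    void doSomethingCounted() { doSomethingImpl!true(); } // for a scope A
}

A[] globalReferences;

void main() {
    auto plain = new A;
    plain.doSomething();
    assert(plain.refCount == 1); // no counting code ran

    auto counted = new A;
    counted.doSomethingCounted();
    assert(counted.refCount == 2); // the global reference was counted
}
```

Both instantiations end up in the binary whether or not anyone calls 
them, which is the cost of always generating two symbols.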

If we had a new keyword to state in the function signature that we 
won't keep the reference somewhere else, that it'll be completely 
forgotten after the call, then we could avoid generating two functions 
needlessly for most member functions. Let's call this keyword "amnesic":

    class A {
        amnesic void doSomething() {
            // illegal, amnesic reference to this put outside function scope
            globalReferences ~= this;
        }
        amnesic void doSomethingElse() {
            // only one generated function
        }
    }

It's basically the same pattern as for invariant and non-invariant 
methods (you can't call an invariant method on a mutable object; you 
can't call a mutable method on an invariant object; const methods work 
with both). Here, you have regular methods, scope methods, and amnesic 
methods, and the amnesic ones work with both scope and non-scope objects.
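
For reference, the invariant/const pattern I'm comparing against looks 
roughly like this. The sketch uses `immutable`, the keyword current 
compilers use for what I call invariant here; Counter and its members 
are hypothetical names made up for the example.

```d
// `immutable` is the keyword current compilers use for what this post calls
// `invariant`. Counter and its members are hypothetical names.
class Counter {
    int value;

    this(int v) { value = v; }
    this(int v) immutable { value = v; }

    void increment() { ++value; }                 // mutable objects only
    int frozenValue() immutable { return value; } // immutable objects only
    int get() const { return value; }             // works with both
}

void main() {
    auto m = new Counter(1);
    auto i = new immutable Counter(5);

    m.increment();
    assert(m.get() == 2);
    assert(i.get() == 5);
    assert(i.frozenValue() == 5);

    // m.frozenValue(); // rejected: immutable method on a mutable object
    // i.increment();   // rejected: mutable method on an immutable object
}
```

An amnesic method would sit where get() sits: a single variant that 
both kinds of object can call.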

I'm going a little off-topic now, but a new keyword such as this could 
be useful for creating invariant objects too.

Basically, while an amnesic function guarantees there are no more 
references to the object after the function call than there were 
before, an amnesic constructor could guarantee uniqueness of the 
reference after the creation. This means the created object could 
become invariant if that was the caller's intent:

    class A {
        amnesic this() {
            // legal: no reference given to the outside world
        }
        amnesic this() {
            // illegal, amnesic reference to this put outside constructor's scope
            globalReferences ~= this;
        }
    }

    A a1 = new A;
    invariant A a2 = new A;

(Others have talked about "unique" as a keyword, but "unique" isn't 
very useful because it describes the state of a reference at a certain 
point in time, not a property of a variable or a type.)
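
As far as I can tell, the closest thing an existing D compiler offers 
is a pure constructor, so here is a sketch of the idea using that. It 
is an approximation only: `pure` and `immutable` are existing D 
features standing in for the proposed amnesic and invariant, and Point 
is a hypothetical class.

```d
// Approximation only: `pure` and `immutable` are existing D features, not the
// proposed `amnesic`. Point is a hypothetical class.
class Point {
    int x;

    // A pure constructor cannot touch mutable globals, so it cannot store
    // `this` anywhere outside the object being built.
    this(int x) pure {
        this.x = x;
        // globalReferences ~= this; // rejected: pure code can't access a mutable global
    }
}

Point[] globalReferences;

void main() {
    Point a1 = new Point(1);                     // ordinary mutable object
    immutable Point a2 = new immutable Point(2); // same constructor, immutable result

    assert(a1.x == 1);
    assert(a2.x == 2);
}
```

The same constructor serves both declarations because nothing else can 
hold a reference to the new object when it returns, which is the 
guarantee an amnesic constructor would give.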

You could also have amnesic parameters to functions that would guarantee 
the function doesn't keep a reference after the call.

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/