Newbie initial comments on D language - scope

Mon Feb 4 05:21:47 PST 2008

On 2008-02-03 10:42:03 -0500, Edward Diener 
<eddielee_no_spam_here at tropicsoft.com> said:

> Michel Fortin wrote:
>> On 2008-02-03 08:20:32 -0500, Edward Diener 
>> <eddielee_no_spam_here at tropicsoft.com> said:
>> 
>>> I am fully cognizant of a dynamically typed language since I program in 
>>> Python also. I agree there is no fixed dividing line. But the 
>>> difference between static typing and dynamic typing is well defined in 
>>> a statically typed language like D. My argument was that for 'scope' to 
>>> be really effective it needs to consider the dynamic type at run-time 
>>> and not just the static type as it exists at compile time.
>> 
>> Considering the dynamic type at runtime means you need to check if 
>> you're dealing with a reference-counted object each time you copy a 
>> reference to that object to see if the reference count needs 
>> adjusting. This is significant overhead over the "just copy the 
>> pointer" thing you can do in a GC. Basically, just checking this will 
>> increase by two or three times the time it takes to copy an object 
>> reference... I can see why Walter doesn't want that.
> 
> I am not knowledgeable about the actual low-level difference between the 
> compiler statically checking the type of an object or dynamically 
> checking the type of an object, and the run-time costs involved.
> 
> Yet clearly D already has to implement code when scopes come to an end 
> in order to destroy stack-based objects, since structs ( user-defined 
> value types ) are already supported and can have destructors.

Yes, and this is implemented in a simple and naive way: by adding an 
explicit call to the destructor at the end of the scope. The scope 
object cannot exist outside the scope, and thus no reference counting 
is needed in the way it's implemented currently.
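
To make the "explicit call at the end of the scope" concrete, here is a minimal sketch in C++, which destroys stack objects in the same way. The names `Resource`, `events` and `runScope` are made up for the illustration and do not come from D or its runtime.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log, used only to observe the order of events.
std::vector<std::string> events;

struct Resource {
    Resource()  { events.push_back("ctor"); }
    ~Resource() { events.push_back("dtor"); }
};

// Runs one scope and returns the events recorded along the way.
std::vector<std::string> runScope() {
    events.clear();
    {
        Resource r;                  // lives on the stack, cannot outlive the block
        events.push_back("body");
    }                                // the compiler inserts the destructor call here
    events.push_back("after");
    return events;
}
```

Calling `runScope()` records the destructor between "body" and "after". No count is consulted anywhere: the position of the closing brace alone decides when the object dies.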

> So the added overhead goes from having to identify structs which must 
> have their destructor called at the end of each scope to having to also 
> identify 'scope' objects which must have their reference count 
> decremented at the end of each scope and have their destructor called 
> if the reference count reaches 0.

Well, identifying structs can be done at compile time since you know 
exactly the type of the struct at that time. Classes are polymorphic, 
so finding out whether a given instance is scope would take a runtime 
check, and that check is almost as costly as doing the reference 
counting itself. Given that, you should probably not bother checking at 
runtime and decide at compile time to just treat any class which has 
the potential to be a scope class as if it were one and actually do the 
reference counting.
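
A rough sketch of the two options, in C++ and under the assumption that every object carries a small header with a flag and a count. `Header`, `copyChecked` and `copyCounted` are hypothetical names; this is not how any D compiler lays out objects.

```cpp
#include <cassert>

// Hypothetical object header carried by every class instance.
struct Header {
    bool scoped;   // set when the dynamic type is a 'scope' class
    int  refs;     // reference count
};

// Option 1: test the dynamic "scopeness" on every reference copy.
Header* copyChecked(Header* p) {
    if (p && p->scoped)   // read from the object, then branch
        ++p->refs;
    return p;
}

// Option 2: count unconditionally for any type that could be scope.
Header* copyCounted(Header* p) {
    if (p)
        ++p->refs;        // read, increment, write back
    return p;
}
```

Both versions have to touch the object's memory on each copy, which is the point of the argument: the check does not buy back much of what the increment costs.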
> 
> 
>> 
>> Besides, the overhead of actually checking the type of the class will be 
>> approximately the same as doing the reference counting. Given this, 
>> it's much better to always just do the reference counting than checking 
>> dynamically if it's needed.
>> 
>> 
>>> class C { ... }
>>> scope class D : C { ... }
>>> 
>>> [...]
>>> 
>>> This may make things much easier for the compiler, but it requires the 
>>> end user knowledge of 'scope', which has been specified at the class 
>>> level, to be applied at the syntax level. Intuitively I feel the 
>>> compiler can figure this out, and that 'scope' should largely be 
>>> totally transparent to the end user above at the syntax level.
>> 
>> Well, if the compiler is to be able to distinguish scope at compile 
>> time, then it needs a scope flag (either explicit or implicit) on each 
>> variable. This is exactly what Walter has proposed to do. He prefers 
>> the explicit route because going implicit isn't going to work in too 
>> many cases. For instance, let's have a function that returns a C:
>> 
>>     C makeOne() {
>>         if (/* random stuff here */)
>>             return new C;
>>         else
>>             return new D;
>>     }
>> 
>> Now let's call the function:
>> 
>>     C c = makeOne();
>> 
>> How can you know at compile time if the returned object of that 
>> function call is scoped or not? You can't, and therefore the compiler 
>> would need to add code to check if the returned object is scope or not, 
>> with a significant overhead, each time you assign a C.
>> 
>> If however you make scope known at compile time:
>> 
>>     scope C makeOne() {
>>         if (/* random stuff here */)
>>             return new C;
>>         else
>>             return new D;
>>     }
>> 
>>     scope C c = makeOne();
>> 
>> Now the compiler knows it must generate reference counting code for the 
>> following assignment, and any subsequent assignment of this type, and 
>> it won't have to generate code to dynamically check the "scopeness" 
>> everywhere you use a C.
> 
> Would you agree that all you are doing here is specifically telling the 
> compiler that an object is 'scope' when it is created rather than 
> having the compiler figure it out for itself by querying the dynamic 
> type of the object at creation time ?

The compiler isn't knowledgeable of what happens within every 
function call. So it can only check at runtime if the function returned 
a C or a D.
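
The same situation sketched in C++, where the caller only ever sees the static type. The `wantD` parameter stands in for the "random stuff" condition and `isD` is a helper invented for this illustration.

```cpp
#include <cassert>
#include <memory>

struct C { virtual ~C() = default; };
struct D : C {};   // stands in for the 'scope' subclass

// The flag replaces the "random stuff" condition from the example.
std::unique_ptr<C> makeOne(bool wantD) {
    if (wantD)
        return std::make_unique<D>();
    return std::make_unique<C>();
}

// Runtime query of the dynamic type; the static type is C either way.
bool isD(const C& obj) {
    return dynamic_cast<const D*>(&obj) != nullptr;
}
```

At the call site, `makeOne(true)` and `makeOne(false)` have exactly the same static type, so only something like `isD(*ptr)`, executed at runtime, can tell the two results apart.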

> If you do, then a much simpler, and to the point, example would be 
> based on my initial OP:
> 
> scope class C { ... }
> 
> scope C c = new C(...);
> 
> I specified that the scope keyword for creating the object is 
> redundant. The compiler can figure it out. The major difference in 
> opinion is that I think the compiler should figure it out from the 
> dynamic type of the object at run-time and not from the static type of 
> the object.

You're perfectly right: it is redundant in *this* case, and you could 
have the compiler implicitly understand that C is a scope class in 
*this* case. But consider this example:

	Object o;
	if (/* random value */)
		o = new C; // C is a scope class
	else
		o = new Object; // Object is the base class of C but isn't scope

Now, should o be automatically reference-counted because you *could* 
later create a C object and assign it to o, or should line 3 
(o = new C) give an error since the type Object isn't scope and C must 
only be assigned as scope? I'd say it should be an error.

This however could be made legal without too much difficulty:

	scope Object o;
	if (/* random value */)
		o = new C; // C is a scope class
	else
		o = new Object; // Object is the base class of C but isn't scope

Basically, you're declaring a scope Object. While Object isn't 
necessarily a scope class, you are telling the compiler to treat it as 
scope, and thus an instance of C, which must be scope, *can* be put in 
this variable. If o weren't scope, it'd be an error to put an instance 
of a scope class in it.
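
For comparison, a C++ sketch in which the counting is also a property of the variable rather than of the class: a `std::shared_ptr<Object>` accepts either dynamic type. `liveC` and `useCountedVariable` are hypothetical helpers used only to observe when the C instance is destroyed.

```cpp
#include <cassert>
#include <memory>

struct Object { virtual ~Object() = default; };

int liveC = 0;   // hypothetical counter to observe destruction
struct C : Object {
    C()  { ++liveC; }
    ~C() override { --liveC; }
};

// Returns the number of live C instances seen inside the scope.
int useCountedVariable(bool pickC) {
    int seenInside = 0;
    {
        std::shared_ptr<Object> o;        // plays the role of 'scope Object o'
        if (pickC)
            o = std::make_shared<C>();
        else
            o = std::make_shared<Object>();
        seenInside = liveC;
    }                                      // last reference dropped here
    return seenInside;
}
```

Because the variable itself is declared as counted, the compiler never has to ask what the object's dynamic type is: whichever branch ran, the release at the end of the scope is the same code.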

But there are still many holes in this scheme in which scope now means 
reference-counted. Take this example:

	class A {
		void doSomething() {
			globalReferences ~= this;
		}
	}
	scope class B : A { }

	A[] globalReferences;

	// Scope could be made implicit here, but it's irrelevant to my example
	scope B b = new B;
	b.doSomething();

This last statement would call A.doSomething, which would add a 
non-scoped reference to globalReferences. That reference isn't counted, 
so it would fail to retain the object. There are two ways around that: 
ignore the problem and let the programmer handle these cases 
(basically, that is what boost::shared_ptr would do in such a 
situation), or introduce a new keyword to decorate parameters for 
functions that do not keep any reference beyond their own call so that 
you don't need to duplicate all your functions for a scope and 
non-scope parameter (much like const is the middle ground between 
mutable and invariant).
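
Here is roughly how the first option plays out with a counted pointer in C++. It uses `std::shared_ptr` as a stand-in for boost::shared_ptr, `liveObjects` and `leakReference` are hypothetical helpers, and the sketch deliberately never dereferences the stale pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct A;
std::vector<A*> globalReferences;   // uncounted references, as in the example
int liveObjects = 0;                // hypothetical counter to observe destruction

struct A {
    A()  { ++liveObjects; }
    virtual ~A() { --liveObjects; }
    void doSomething() { globalReferences.push_back(this); }  // 'this' is not counted
};
struct B : A {};

// Returns how many stale entries remain after the counted owner goes away.
std::size_t leakReference() {
    globalReferences.clear();
    {
        std::shared_ptr<B> b = std::make_shared<B>();
        b->doSomething();
    }   // the count reaches zero here, so the B is destroyed
    return globalReferences.size();   // the pointer is still stored, now dangling
}
```

After `leakReference()` returns, `liveObjects` is back to 0 while `globalReferences` still holds one entry. The count never saw the copy made through `this`, so nothing kept the object alive.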

(Sidenote: this keyword could be useful to implement something like 
"unique" as it was discussed in another thread, as it'd allow functions 
to be called with a unique parameter and guarantee that no external 
references are kept after the call, thus preserving uniqueness.)

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/