Newbie initial comments on D language - scope
Edward Diener
eddielee_no_spam_here at tropicsoft.com
Mon Feb 4 19:43:21 PST 2008
Michel Fortin wrote:
> On 2008-02-03 10:42:03 -0500, Edward Diener
> <eddielee_no_spam_here at tropicsoft.com> said:
>
>> Michel Fortin wrote:
>>> On 2008-02-03 08:20:32 -0500, Edward Diener
>>> <eddielee_no_spam_here at tropicsoft.com> said:
>>>
>>>> I am fully cognizant of a dynamically typed language since I program
>>>> in Python also. I agree there is no fixed dividing line. But the
>>>> difference between static typing and dynamic typing is well defined
>>>> in a statically typed language like D. My argument was that for
>>>> 'scope' to be really effective it needs to consider the dynamic type
>>>> at run-time and not just the static type as it exists at compile time.
>>>
>>> Considering the dynamic type at runtime means you need to check if
>>> you're dealing with a reference-counted object each time you copy a
>>> reference to that object to see if the reference count needs
>>> adjusting. This is significant overhead over the "just copy the
>>> pointer" thing you can do in a GC. Basically, just checking this will
>>> increase by two or three times the time it takes to copy an object
>>> reference... I can see why Walter doesn't want that.
>>
>> I am not knowledgeable about the actual low-level difference between
>> the compiler statically checking the type of an object or dynamically
>> checking the type of an object, and the run-time costs involved.
>>
>> Yet clearly D already has to implement code when scopes come to an end
>> in order to destroy stack-based objects, since structs ( user-defined
>> value types ) are already supported and can have destructors.
>
> Yes, and this is implemented in a simple and naive way: by adding an
> explicit call to the destructor at the end of the scope. The scope
> object cannot exist outside the scope, and thus no reference counting is
> needed in the way it's implemented currently.
Reference counting would be implemented for 'scope' objects only. The
main overhead at the end of each scope is going through all the objects
to determine which ones are 'scope' objects. Perhaps this is too
expensive, but it would at least be interesting to see whether it is.
>
>> So the added overhead goes from having to identify structs which must
>> have their destructor called at the end of each scope to having to
>> also identify 'scope' objects which must have their reference count
>> decremented at the end of each scope and have their destructor called
>> if the reference count reaches 0.
>
> Well, identifying structs can be done at compile time since you know
> exactly the type of the struct at that time. Classes are polymorphic, so
> it'd be a costly runtime check to know that, and that check is almost as
> costly as doing the reference counting itself. Given that, you should
> probably not bother at runtime and decide at compile time to just treat
> any class which has the potential to be a scope class as if it were one
> and actually do the reference counting.
Your point is well taken, but I would still like to see whether the
check for a 'scope' object would be that expensive. It could be as easy
as checking an extra 'int' reference count in each object and seeing
whether it is 0 ( normal GC object ) or not 0 ( 'scope' object ).
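
A minimal sketch of the check I have in mind, written by hand in D. The
class Counted, its refCount and destroyed fields, and the retain and
release helpers are all hypothetical names for illustration; nothing like
them exists in D's real Object, and the compiler would have to emit the
equivalent calls itself.

```d
// Hypothetical base class: every object carries a counter.
// 0 marks a normal GC object, non-zero marks a 'scope' object.
class Counted
{
    int refCount;      // assumption: not a field of D's real Object
    bool destroyed;    // stands in for "the destructor has run"

    void dispose() { destroyed = true; }
}

// What might be emitted whenever a reference is copied.
void retain(Counted o)
{
    if (o !is null && o.refCount != 0)
        ++o.refCount;
}

// What might be emitted whenever a reference leaves its scope.
void release(Counted o)
{
    if (o is null || o.refCount == 0)
        return;                 // normal GC object: nothing to do
    if (--o.refCount == 0)
        o.dispose();            // last reference gone: destruct now
}

unittest
{
    auto gcObj = new Counted;       // refCount stays 0
    auto scopeObj = new Counted;
    scopeObj.refCount = 1;          // created as a 'scope' object

    retain(scopeObj);               // a second reference is made
    release(scopeObj);              // one reference leaves scope
    assert(!scopeObj.destroyed);
    release(scopeObj);              // the last reference leaves scope
    assert(scopeObj.destroyed);

    release(gcObj);                 // the check is a single compare
    assert(!gcObj.destroyed);
}
```

The unittest block should run when the file is compiled with dmd's
-unittest and -main switches. Whether one extra compare on every
reference copy is acceptable is exactly the cost in question; the sketch
only shows how small the check itself could be.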
>>
>>
>>>
>>> Besides, the overhead of actually checking the type of the class will
>>> be approximately the same as doing the reference counting. Given
>>> this, it's much better to always just do the reference counting than
>>> checking dynamically if it's needed.
>>>
>>>
>>>> class C { ... }
>>>> scope class D : C { ... }
>>>>
>>>> [...]
>>>>
>>>> This may make things much easier for the compiler, but it requires
>>>> the end user knowledge of 'scope', which has been specified at the
>>>> class level, to be applied at the syntax level. Intuitively I feel
>>>> the compiler can figure this out, and that 'scope' should largely be
>>>> totally transparent to the end user above at the syntax level.
>>>
>>> Well, if the compiler is to be able to distinguish scope at compile
>>> time, then it needs a scope flag (either explicit or implicit) on
>>> each variable. This is exactly what Walter has proposed to do. He
>>> prefers the explicit route because going implicit isn't going to work
>>> in too many cases. For instance, let's have a function that returns a C:
>>>
>>> C makeOne() {
>>>     if (/* random stuff here */)
>>>         return new C;
>>>     else
>>>         return new D;
>>> }
>>>
>>> Now let's call the function:
>>>
>>> C c = makeOne();
>>>
>>> How can you know at compile time if the returned object of that
>>> function call is scoped or not? You can't, and therefore the compiler
>>> would need to add code to check if the returned object is scope or
>>> not, with a significant overhead, each time you assign a C.
>>>
>>> If however you make scope known at compile time:
>>>
>>> scope C makeOne() {
>>>     if (/* random stuff here */)
>>>         return new C;
>>>     else
>>>         return new D;
>>> }
>>>
>>> scope C c = makeOne();
>>>
>>> Now the compiler knows it must generate reference counting code for
>>> the following assignment, and any subsequent assignment of this type,
>>> and it won't have to generate code to dynamically check the
>>> "scopeness" everywhere you use a C.
>>
>> Would you agree that all you are doing here is specifically telling
>> the compiler that an object is 'scope' when it is created rather than
>> having the compiler figure it out for itself by querying the dynamic
>> type of the object at creation time ?
>
> The compiler isn't knowledgeable of what happens within every function
> call. So it can only check at runtime if the function returned a C or a D.
Fully agreed.
>
>> If you do, then a much simpler, and to the point, example would be
>> based on my initial OP:
>>
>> scope class C { ... }
>>
>> scope C c = new C(...);
>>
>> I specified that the scope keyword for creating the object is
>> redundant. The compiler can figure it out. The major difference in
>> opinion is that I think the compiler should figure it out from the
>> dynamic type of the object at run-time and not from the static type of
>> the object.
>
> You're perfectly right: it is redundant in *this* case, and you could
> have the compiler implicitly understand that C is a scope class in
> *this* case. But consider this example:
>
> Object o;
> if (/* random value */)
>     o = new C; // C is a scope class
> else
>     o = new Object; // Object is the base class of C but isn't scope
>
> Now, should o be automatically reference-counted because you *could*
> later create a C object and assign it to o, or should line 3 give an
> error since the type Object isn't scope and C must only be assigned as
> scope? I'd say it should be an error.
I say it should be a 'scope' object. The dynamic type of o is that of a
'scope' class.
>
> This however could be made legal without too much difficulty:
>
> scope Object o;
> if (/* random value */)
>     o = new C; // C is a scope class
> else
>     o = new Object; // Object is the base class of C but isn't scope
>
> Basically, you're declaring a scope Object. While Object isn't
> necessarily a scope class, you are telling the compiler to treat it as
> scope, and thus an instance of C, which must be scope, *can* be put in
> this variable. If o wasn't scope, it'd be an error to put an instance of
> a scope class in it.
But then the end-user is required to know that C is a scope class. I do
not think that should be necessary.
The whole point of 'scope' ( RAII ) in a GC language is that, for the
most part, an end-user should instantiate and use 'scope' classes just
as he would normal GC classes, with the language taking care to
automatically destruct an object of a 'scope' class just as soon as the
last reference to that object goes out of scope.
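
To make the intended behaviour concrete, here is a rough hand-written
stand-in, using a struct with a postblit and a destructor to do what I
would like the language to do implicitly for a 'scope' class. The names
Resource, Scoped and close are hypothetical, chosen only for this sketch;
the idea under discussion is that the end-user would never have to write
or see such a wrapper.

```d
// Hypothetical resource; 'closed' records that cleanup has happened.
class Resource
{
    bool closed;
    void close() { closed = true; }
}

// Hand-written stand-in for an implicit 'scope' reference.
struct Scoped
{
    private Resource res;
    private int* count;        // shared by all copies of this reference

    this(Resource r)
    {
        res = r;
        count = new int;
        *count = 1;
    }

    this(this)                 // postblit: runs on every copy
    {
        if (count)
            ++*count;
    }

    ~this()                    // runs when each copy leaves its scope
    {
        if (count && --*count == 0)
            res.close();       // last reference gone: clean up at once
    }
}

unittest
{
    auto r = new Resource;
    {
        auto a = Scoped(r);
        {
            auto b = a;        // second reference, count 1 -> 2
        }                      // b leaves scope, count 2 -> 1
        assert(!r.closed);
    }                          // a leaves scope, count 1 -> 0
    assert(r.closed);
}
```

The cleanup happens at the closing brace of the outer block rather than
at some later garbage collection, which is the property I am after.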
>
> But there are still many holes in this scheme in which scope now means
> reference-counted. Take this example:
>
> class A {
>     void doSomething() {
>         globalReferences ~= this;
>     }
> }
> scope class B : A { }
>
> A[] globalReferences;
>
> scope B b = new B; // Scope could be made implicit here, but it's
>                    // irrelevant to my example
> b.doSomething();
>
> This last statement would call A.doSomething which would put a
> non-scoped reference to globalReferences, which would fail to retain the
> object. There are two ways around that: ignore the problem and let the
> programmer handle these cases (basically, that is what boost::shared_ptr
> would do in such a situation), or introduce a new keyword to decorate
> parameters for functions that do not keep any reference beyond their
> own call so that you don't need to duplicate all your functions for a
> scope and non-scope parameter (much like const is the middle ground
> between mutable and invariant).
No, A.doSomething would put a 'scoped' reference in a non-scope array.
However if we specify 'scope A[] globalReferences;' we can solve that
problem.
Of course we may not control the declaration of 'A[] globalReferences;'.
I acknowledge that.
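
A small sketch of the difference, again with a hand-written counted
reference standing in for 'scope'. Resource, CountedRef, plainRefs and
countedRefs are hypothetical names for this illustration: plainRefs plays
the role of 'A[] globalReferences' and countedRefs the role of
'scope A[] globalReferences'.

```d
class Resource
{
    bool closed;
    void close() { closed = true; }
}

// Hypothetical counted reference, standing in for a 'scope' reference.
struct CountedRef
{
    Resource res;
    private int* count;

    this(Resource r)
    {
        res = r;
        count = new int;
        *count = 1;
    }

    this(this) { if (count) ++*count; }

    ~this()
    {
        if (count && --*count == 0)
            res.close();
    }
}

Resource[] plainRefs;       // like 'A[] globalReferences'
CountedRef[] countedRefs;   // like 'scope A[] globalReferences'

unittest
{
    auto r1 = new Resource;
    {
        auto h = CountedRef(r1);
        plainRefs ~= h.res;     // escapes without touching the count
    }
    // The array still refers to an object that was already cleaned up.
    assert(plainRefs[0].closed);

    auto r2 = new Resource;
    {
        auto h = CountedRef(r2);
        countedRefs ~= h;       // the copy runs the postblit, count 1 -> 2
    }
    // The array's own counted reference keeps the object alive.
    assert(!countedRefs[0].res.closed);
}
```

The first half is the hole Michel describes; the second half is what
declaring the array itself as 'scope' would buy, provided we control
that declaration.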