Newbie initial comments on D language - scope
Edward Diener
eddielee_no_spam_here at tropicsoft.com
Thu Feb 7 17:10:28 PST 2008
Walter Bright wrote:
> Edward Diener wrote:
>>> It will be required as any user could declare an object instance as
>>> 'scope', and so any separately compiled code must anticipate that.
>> I agree in the sense that every object may need to carry an extra
>> reference count with it even though it will not be used for the vast
>> majority of objects, which will be GC. I do not view this as an issue.
>
> It's a very serious issue, as it essentially negates much of the
> advantage of general gc. For one example, you'll have to give up
> interior pointers.
I do not follow what having a reference count for an object has to do
with giving up interior pointers.
>
>>> It's just that if any object could be scoped based on a runtime test,
>>> that then you've got to insert that test at every assignment, copy
>>> construction, and scope exit. You've got all the overhead of RC.
>> Yes, agreed. There will be overhead to deal with 'scope' objects.
>
> It will be needed for *every* gc object, too. And not just the
> allocation for the reference count, the test has to be executed every time.
The reference-count test is executed whenever a 'scope' object needs
handling that a non-scoped object does not. Perhaps this is what you
mean by "every time". As I see it, those "times" are assigning or
copying a reference and exiting a scope. When an object is instantiated
no test is needed, since the compiler always knows at creation whether
the object is 'scope' ( either from the 'scope sometype someobject'
notation or because sometype is declared as a 'scope class' ).
>
>> However you already have some overhead dealing with stack variables,
>> and so has C++ for its existence at the end of each scope and it sure
>> does not make C++ slower than most GC systems.
>
> If reference counting worked that well, there would be no push to add gc
> to C++0x.
No one ever said that reference counting solves every memory problem
that GC does. The most obvious case I know of where GC is needed over
and above reference counting is cross-referenced objects, where two
objects hold references to each other and neither count ever reaches 0.
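The cross-reference problem can be illustrated with a minimal sketch. It is written in C++ rather than D, using std::shared_ptr as a stand-in for a reference-counted object. The names Node, live_nodes and nodes_alive_after_scope are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <memory>

static int live_nodes = 0;   // hypothetical counter of constructed-but-not-destructed nodes

struct Node {
    std::shared_ptr<Node> other;   // counted reference to a peer
    Node()  { ++live_nodes; }
    ~Node() { --live_nodes; }
};

// Builds two nodes that reference each other, lets the local references
// go out of scope, and reports how many nodes are still alive afterwards.
int nodes_alive_after_scope() {
    live_nodes = 0;
    std::weak_ptr<Node> watch;     // non-owning observer, does not affect the count
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->other = b;
        b->other = a;              // a and b now keep each other alive
        watch = a;
    }                              // both locals are gone, both counts are still 1
    int alive = live_nodes;
    // Break the cycle by hand so that this sketch itself does not leak.
    if (auto a = watch.lock()) a->other.reset();
    return alive;
}
```

Pure reference counting never reclaims the pair on its own, whereas a tracing GC would find both objects unreachable once the scope ends.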
>
>
>> I can not say too strongly that if RAII, via 'scope', is to work in D
>> or any other GC language, the end-user should be as oblivious as
>> possible to it working automatically. This means that class designer,
>> who surely must know whether objects of their class need RAII, tells
>> the compiler that his type is 'scope' and the end-user proceeds to use
>> objects of that type just as if he would use normal GC objects.
>>
>> Otherwise you are creating a bifurcated system which does the end-user
>> no good. Not only must the end user know something in advance about
>> the inner workings of a class ( that it needs RAII ) when the class
>> designer already knows it, but he must also use a separate notation to
>> deal with objects of that class.
>
> For those cases, all the class designer needs to do is present to the
> user the struct wrapper for the class, not the class itself.
Sure, but then there is a different notation for dealing with
specific classes, which nullifies the whole point of being able to
specify an RAII type ( via 'scope class' in D ).
>
>
>>> Then you have the problem that all generated code that manipulates
>>> any object must insert all the rc machinery for that object, just in
>>> case some user somewhere instantiates it as 'scope'.
>>
>> It needs to have inserted for it the mechanism which determines
>> whether that object is a 'scope' object or not. It probably needs the
>> extra int for possible reference counting. Other than that I do not
>> see what other machinery is needed for normal GC objects.
>
> Consider:
>
> void foo(C c) { C d = c; }
>
> foo() has no idea if c is ref counted or gc. Therefore, it has to check
> every time, at run time. All the machinery has to be there, just in case.
I agree.
>
>> If we are really still in the age, with vtables and alignment padding
>> and god knows what else a compiler writer needs per object to
>> correctly do his work, where another 4 bytes of int is considered
>> prohibitory, then I give up the whole idea <g>.
>
> It's not just another 4 bytes.
I meant that memory-wise it is just 4 bytes. Of course it is extra
programming from the language implementer's point of view.

Let me try to make the case for RAII in D via 'scope' once again, by
presenting the technical details as I see them, and then you will no
doubt choose what you think best. If I am really far off please tell me,
since otherwise there is little reason for me to argue the idea
further. I appreciate that you have heard me out.
First, the situations when RAII processing occurs:
1) A 'scope' object is instantiated. The internal reference count,
however you choose to implement it, is set to 1.
2) A 'scope' object's reference is assigned/copied to another reference.
If the source is not a null reference, the reference count is
incremented.
3) A 'scope' object's reference is changed through assignment. If the
old reference is not a null reference, the old object's reference
count is decremented, and if it reaches 0 the old object is destructed
( its destructor is called ) and its memory is released ( the latter may
happen later through GC for all I know ).
4) A 'scope' object reaches the end of its scope. Processing then
occurs exactly as it does in 3).
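The four situations above can be sketched as a small counted handle. This is a C++ model, not D and not a proposal for how the D compiler would implement it. Resource, ScopeRef and run_four_cases are hypothetical names, and memory is released immediately rather than left to a GC.

```cpp
#include <cassert>

// Hypothetical resource type; '*destroyed' records destructor calls.
struct Resource {
    int* destroyed;
    int count = 0;                       // the internal reference count
    explicit Resource(int* d) : destroyed(d) {}
    ~Resource() { ++*destroyed; }
};

// Hypothetical handle modelling a reference to a 'scope' object.
class ScopeRef {
    Resource* p_ = nullptr;
    static void acquire(Resource* p) { if (p) ++p->count; }
    static void release(Resource* p) {
        if (p && --p->count == 0) delete p;      // destruct when the count hits 0
    }
public:
    explicit ScopeRef(Resource* p) : p_(p) { acquire(p_); }   // 1) count becomes 1
    ScopeRef(const ScopeRef& o) : p_(o.p_) { acquire(p_); }   // 2) copy increments
    ScopeRef& operator=(const ScopeRef& o) {                   // 3) reassignment
        acquire(o.p_);       // increment first so self-assignment is safe
        release(p_);         // then drop the old reference
        p_ = o.p_;
        return *this;
    }
    ~ScopeRef() { release(p_); }                               // 4) end of scope
    int count() const { return p_ ? p_->count : 0; }
};

// Walks through the four cases; returns the number of destructor calls.
int run_four_cases() {
    int destroyed = 0;
    {
        ScopeRef a(new Resource(&destroyed));    // 1)
        ScopeRef b = a;                          // 2)
        ScopeRef c(new Resource(&destroyed));
        c = a;                                   // 3) c's old object is destructed here
        assert(destroyed == 1);
        assert(a.count() == 3);
    }                                            // 4) a, b and c leave scope
    return destroyed;
}
```

In this model the end user writes ordinary-looking copies and assignments, and all of the counting is hidden inside the handle.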
There are two ways of dealing with the identification of a 'scope' object.

The first way is through its static type: the compiler always knows the
static type of an object, so it can generate the correct code in each of
the 4 situations above for a 'scope' object and leave the treatment of
normal non-scope objects unchanged. This is the easiest way from the
compiler's perspective and no doubt the fastest. There is no penalty for
normal non-scope GC objects and only the 'scope' object undergoes
special, slower processing. I still have hope that if you see fit to go
this way that you will allow the user to identify a 'scope' object
either by the 'scope' keyword applied to the instantiated object itself
or by the 'scope' keyword applied to the class type of the object. I say
that because I can not conceive of a compiler that could not figure out
that an object was 'scope' because its class type was 'scope'.
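The first way can be sketched as follows, again as a C++ model rather than D. The base class ScopeClass is a hypothetical marker standing in for a 'scope class' declaration, and File, Point, copy_ref and rc_operations are likewise assumptions made for the illustration. The choice between counted and uncounted code is made from the static type alone, so non-scope types pay nothing at run time.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical marker base: the class designer opts in once, at the type.
struct ScopeClass { int refs = 0; };

struct File : ScopeClass {};     // designer says: objects of this type need RAII
struct Point { int x = 0; };     // ordinary object, no counting wanted

static int rc_operations = 0;    // how often counting code actually ran

// Selected at compile time for types derived from ScopeClass.
template <class T>
T* copy_ref_impl(T* src, std::true_type) {
    if (src) { ++src->refs; ++rc_operations; }
    return src;
}

// Selected at compile time for every other type: a plain pointer copy.
template <class T>
T* copy_ref_impl(T* src, std::false_type) { return src; }

// Copying a reference; the dispatch is resolved from T, with no run-time test.
template <class T>
T* copy_ref(T* src) {
    return copy_ref_impl(src, std::is_base_of<ScopeClass, T>());
}

// Copies one reference of each kind; returns how many counting operations ran.
int demo_static_dispatch() {
    rc_operations = 0;
    File f;
    Point p;
    File* f2 = copy_ref(&f);     // counted
    Point* p2 = copy_ref(&p);    // not counted
    assert(f2 == &f && p2 == &p);
    assert(f.refs == 1);
    return rc_operations;
}
```

The user-side notation is the same for both types; only the class declaration differs.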
The second way is by examining its dynamic type at run-time and
generating code to take the appropriate action. This second way is
harder for the compiler to do and no doubt slower, although how much
slower is something which could only be measured in practice by you
with D. With this second way, every object must be tested in each of the
4 cases above to determine if it is a 'scope' object and to take the
appropriate action if it is. Obviously 4) above is the potential killer
as far as this goes because it would mean testing every reference at the
end of each scope, just in case one or more of them is a 'scope' object
and needs its end of scope processing. In the other three cases one is
dealing with a single object in a well-defined, if general, situation so
the overhead would be much less. This second way is obviously much
better from the end-user's point of view, which by no means makes it the
better solution in practice.
My only practical argument with all those who are certain that this
second way would be an unnecessary imposition on all the users of normal
GC objects, and want to regale me with code absolutely "proving" a
priori their case, is that once an object is determined to be normal GC
there is nothing further that needs be done for that object which would
not have been done otherwise. Of course there is overhead for
determining this in the cases above, especially with 4).
For this second way I have presented the extra reference count field,
attached internally to all objects, as a way of determining if the
object is 'scope' when doing 2), 3), or 4), with the proviso that when
doing 1) the value for all normal GC objects of this field would be set
to 0. If this is an entirely impractical solution, I am sure that if you
decide to pursue the possibility of the second way, just to see if it
can be done and what is the practical penalty in doing it, you will find
a better scheme.
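A minimal sketch of the second way, under the assumption described above that every object carries a count field which is 0 for normal GC objects. It is a C++ model, not D. Object, init_gc, init_scope, copy_ref_checked, drop_ref_checked and tests_run are hypothetical names, and a flag stands in for the destructor call.

```cpp
#include <cassert>

// Hypothetical object header: every object carries the field.
// rc == 0 means "normal GC object"; rc > 0 means "'scope' object".
struct Object {
    int rc = 0;
    bool destructed = false;     // stands in for the destructor having run
};

static int tests_run = 0;        // how many run-time checks were executed

// 1) Instantiation: no test, the compiler knows which case applies.
void init_gc(Object& o)    { o.rc = 0; }
void init_scope(Object& o) { o.rc = 1; }

// 2) Copying a reference: always test, count only for 'scope' objects.
Object* copy_ref_checked(Object* o) {
    ++tests_run;
    if (o && o->rc > 0) ++o->rc;
    return o;
}

// 3) and 4) Dropping a reference, by reassignment or at scope exit.
void drop_ref_checked(Object* o) {
    ++tests_run;
    if (o && o->rc > 0 && --o->rc == 0) o->destructed = true;
}

// Models foo(C c) { C d = c; }: the callee cannot know what it was handed.
void foo(Object* c) {
    Object* d = copy_ref_checked(c);
    drop_ref_checked(d);         // d reaches the end of its scope
}

// Passes one object of each kind through foo(); returns the number of checks.
int demo_dynamic() {
    tests_run = 0;
    Object gc, sc;
    init_gc(gc);
    init_scope(sc);
    foo(&gc);                    // two checks, no other work
    foo(&sc);                    // two checks, count goes 1 -> 2 -> 1
    assert(gc.rc == 0 && !gc.destructed);
    assert(sc.rc == 1 && !sc.destructed);
    drop_ref_checked(&sc);       // the owning reference leaves scope
    assert(sc.destructed);
    return tests_run;
}
```

The GC object is never modified, yet it still pays for a check at each copy and each scope exit, which is the overhead under discussion.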