RCArray is unsafe
Sativa via Digitalmars-d
digitalmars-d at puremagic.com
Mon Mar 2 23:17:09 PST 2015
On Sunday, 1 March 2015 at 15:44:49 UTC, Marc Schütz wrote:
> Walter posted an example implementation of a reference counted
> array [1], that utilizes the features introduced in DIP25 [2].
> Then, in the threads about reference counted objects, several
> people posted examples [3, 4] that broke the suggested
> optimization of eliding `opAddRef()`/`opRelease()` calls in
> certain situations.
>
> A weakness of the same kind affects DIP25, too. The core of the
> problem is borrowing (ref return as in DIP25), combined with
> manual (albeit hidden) memory management. An example to
> illustrate:
>
1 > struct T {
2 > void doSomething();
3 > }
4 > struct S {
5 > RCArray!T array;
7 > }
8 > void main() {
9 > auto s = S(RCArray!T([T()])); // s.array's refcount is
10> now 1
11> foo(s, s.array[0]); // pass by ref
++ s.array[0] = myt; // Would also be invalid
12> }
13> void foo(ref S s, ref T T) {
14> s.array = RCArray!T([]); // drop the old s.array
15> t.doSomething(); // oops, t is gone
16> }
1. Assignment to RC types is not the same as assignments to
non-RC types.
2. Allocation of RC types is more than just allocating of non-RC
types.
Using the above example:
#9: The assignment and allocation does two things:
a. Decrements the current RC of s.array(here it is null so no
RC is used)
b. Increments the ref count of the allocated RCArray by one.
#11: we pass both s and a referent to inside S. This is odd
behavior and not technically required in this case(just pass S).
Of course, we can't expect such behavior to creep up in complex
code.
#14. In this case the behavior is correct. The assignment does
the following:
a. Decrements the current RC of s.array(here was 1, so now it
is 0)
b. Increments the current RC of the newly allocated RC array
and assigns it to s.array.
#15. Since T refers to the old s.array, which now has a ref count
of 0(which we can assume to then be unallocated), line 15 becomes
a run-time error.
---
How to fix?
It seems that the only way to make it work well is to use sort of
"lazy" ref counting.
Here references are created at the start of functions and
decremented at the end of functions... rather that in place(which
then causes problems with things that happen after it).
But this can't work as in the ++ I added. Here it would solve
your problem but not fix the a generalization of it.
---
Is there a solution? Essentially no. Only at the end of the
program could we possibly have the last function release all RC
counted arrays. Since it is the end of the program it is the only
time we can know that no more RC types will be used.
The problem is analogous to the multiple-inheritance problem.
---
Flow analysis can only help so far(it won't help with dynamically
allocated types that are allocated depending on some "random"
factor.
---
Note that since t is pointing to a part of S, technically we
can't release s until t is released.
So, effectively, S has to be ref counted too! And it's ref count
must check check T's ref count. If it is non-zero then the it
can't release itself(or the reference to it's array).
Also, something eventually has to release S(or which is then
s.array). When? Well, since t depends on S, it is when t is
released. So we would naturally have to inform t that it should
too release s when it is released.
----
So, if that sort of makes sense I'll explain the solution: It
might be a bit fuzzy but I'll try to make things very clear. [I
will also sometimes conflate the usage of concrete objects and
their corresponding abstract types. e.g., "T with t". It should
be clear context which is meant. e.g., If I say "T is released",
I mean that the object with type T was released. When they need
to be distinguished between I will do so(mainly in the code
examples]
A *reference counted* type, from here on designated as RCT, is a
type that only allows itself to be released to the heap when
nothing depends on it. [Obviously if anything depends on it when
after it is released will fail to be fulfilled]
If a RCT does not contains any references to any RCT's we will
call it a free RCT or FRCT. Such types behave in a simple way.
They can be free'ed completely immediately. Since FRCT's do not
contain any references to other RCT's all their references are
simple references that can be free'ed immediately.
We require that if a type contains a reference to a RCT then it
too is a RCT. [the relation ContainsRCT(T1,T2) ==>
ContainsRCT(T2,T1) ==> Both T1 and T2 are either RCT's or not
RCT's] This isn't a big deal but without it we end up requiring
all types to be RCT's. Here we allow some types to be not be
RCT's.
In code,
class T { } // a standard non-reference counted type.
// It can't use any RCT's as references. It is your
basic D type of class.
rcclass FRCT // rcclass means reference counted class. It is
identical to your basic
// D type of class but has different keyword
designator(which is why we call them free).
{
int x;
float[] y
// etc, FRCT does not point to any RCT's
}
rcclass X { } // Just another RCT
rcclass RCT
{
FRCT myFRCT;
int y;
RCT foo(ref barRCT);
X x;
// etc Basically this is a class that can include other RCT's
}
----
Ultimately we could simply use the basic D class but we would
have to designate every type as an RCT type... even the basic
types. I'm trying to avoid some basic issues of that method that
would derail the discussion. [But for those that want to know:
one could simply treat all objects in d as rcclass types describe
above, even basic types and such, but optimize out the reference
counting part in in the compiler for those basic and FRCT
types... this then would simply lead back to the method I'm using
to describe it but simply some of the details were hidden in the
"simple method" by the compiler]
So, once you have such a type system. The way to deal with the
complex relationships that exist are to know them:
1. A RCT have the following relationships between it's containing
types. We will denote T to be a non-RCT type, FRCT to be an FRCT,
and RCT to be an RCT.
RCT can contain T's, FRCT's, and/or, RCT's.
In Code:
rcclass RCT
{
T t;
FRCT myFRCT;
RCT myRCT;
}
In this case, which is the only case we will talk about w.l.o.g),
when an RCT is released it must attempt to release each of it's
references... in example above: t, myFRCT, and myRCT. Since t is
a simple type, it can be released immediately. No RCT's point to
it because if they did, then T would be an RCT and it isn't.
Similarly myFRCT can easily be released because it is just a
complex of things like T's. In this case though there is
something slightly different about it which is the referencing
counting part. Essentially it's contained types are simple types
but it still has a reference counting aspect for itself.
Basically T's and FRCT's can free themselves when they need to.
This is because they know how to free each reference they contain
and it can't cause referential problems simply because they do
not contain such references. This by no means they won't cause
other parts of code to access invalid references. A pointer to
such a member of a class can easily try and access the member
after the object has been freed. But this is not our concern
because we are only dealing with RCT's. Our goal isn't to have
everything be RCT's. If that was the case then the T and FRCT
cases would not be needed since we won't use them. e.g., We might
want the GC to handle non-RCT and some other method for FRCT's.
This allows one to build on current D system. But one must
remember that RCT's only guarantee valid referencing when they
access other RCT's.
The last sentence above directs where we are going with this:
Since we have RCT we can make sure they know how to communicate
with each other. (we are creating a metatype here)
By having the proper communication system here it is possible for
the types to know when they need to be released.
In fact, it is not hard and several methods can be used, even
simultaneously.
Axiom: Every RCT has the ability to have any other RCT that
contains a reference to it, to be able to release it. Also, vice
versa!
[This requirement essentially allows the very final reference to
know that it needs to release other dependencies it had and
allows it to not only clean up itself but everything else it
needs to clean up]
Once you have the above requirement of RCT's(The implement dealt
with at a different time), you can lazily release objects in the
background without issue(fragmentation of the heap is a harder
matter but still has a solution because of the above requirement).
How do they communicate:
Your example will do:
rcstruct T
{
void doSomething();
}
rcstruct S
{
T[] array;
}
void main()
{
auto s = S(new T()); // Here when the compiler creates
the array of T's S knows T is a RFC.
// In this case it can get at the
hidden communications layer of T and
// inform T that it is using it.
Hence at this point
// Both S and T can communicate
with each other! (not just blind inc/dec of an int
auto t = s.array[0];
foo(s, s.array[0]); // nothing special except foo much
create an added reference to the two argments
// This can be optimized out as
long as foo returns once for every time it
// is called. In this case, At the
start of foo, the counters will be inc'ed
// and at the end they will be
dec'ed. Always a zero sum outside the function call.
t.doSomething(); // Skip to foo, come back when done
with foo! Great, t is still around!
// (RC is still zero at this point!)
}
void foo(ref S s, ref T t)
{
s.array = new T[]; // Here we are reassigning the
array. No problem! We simply inc the ref count on the new array
and decment it on the old. Just like normal ARC'ing! The only
different is that t is not immediately release! Even though it
has a RC count of 0. Think of it as lazy ARC'ing... or LARC'ing!
;)
t.doSomething(); // Great, even though t has an RC
count of zero, it hasn't been released yet!
}
So, what is this LARCING?
Essentially instead of s simply dumping t in foo, it informs t
that it is no longer using it. This sets the RC = 0 !!BUT!! t
does not free itself immediately! If it does, it leads to the
problem you have given and also to the extended problem I have
shown above(calling t.doSomething in main).
If t sticks around though, we are not releasing memory and hence
what is the point?
Well, as usual this is an engineering problem. We have a few
cases.
1. The compiler can prove that a RCT goes out of scope and is not
used any more. In this case t can release itself at the end of
the function where it is provable that it is no longer used. (In
your example it is at the end of foo ad in my example it is at
the end of main) Flow analysis, simple brute force checking, etc
are used here. Most cases in program will fall under this
category. Local variables that are not referenced to by global
variables are proven to be releasable at the end of the function
call. Hence, many cases will work as simple ARC'ing and not
require LARC'ing.
2. The compiler cannot prove when a variable is release. This is
when the type itself has to decide when it is not being used.
(this the last part of the problem and your example falls under
it)
In this case, the communication between S and T is crucial. When
s decides it no longer need t, it informs t. t has a list of
every reference that is using it(this can just be a pointer to
the ref
count of the RCT). Having this list, while potentially weighty,
lets t know when all the references to it are not needed any
more(sort of chains or links between any other RCT and t have
been broken).
Once this is done it is a simple matter of t trying to release
itself every time it goes out of scope.
a. At some point t will release itself
b. t will never be able to release itself(this is generally
due to program error in which a reference was made to t without
removing the reference when it was done). Note this is why if
every type is an RCT, prevents dangling(e.g., we would have full
LARC'ing... or FLARC'ing.). FLARC'ing may not be optimal
though(all pointers would have to be RCT which would probably be
quiet inefficient. Also interdependencies are handled easily:
s.t = t;
t.s = s;
while (!isFree(t) && !isFree(s))
{
s.t = null;
t.s = null;
if (t.refCount == 0)
s.release();
if (s.refCount == 0)
t.release();
}
The reason that t and s are released in LARC'ing is
because at the 2nd time through the loop s.release() knows that t
no longer depend and can release itself immediately(because t.s =
null set s's *t entry* to 0 on it so now s knows that t doesn't
depend on it). This is just a prototype that can be used to show
that any level of nesting between types will eventually resolve
all releases properly.
c. Alternatively, a GC like memory manager can scan through
the types and release all types who's reference count = 0 and
who's dependencies are all invalid(nothing depends on it). It can
do this in the background without issue and can compact
consecutively release objects without issue. i.e., the GC knows
what part of the heap belonged to objects there were ref counted
and if it is proven they have any references then it can mark
them as free'ed memory. Consecutive free'ed objects in memory can
be combined since they never will be used again. This can be done
behind in parallel and is sort of minor defragmentation. In fact,
analogous to defragmenting a drive. If you know a file is not
being used then you can defragment it in the background and move
the various parts around without issue because you know it is not
being used. Once it is being uses, you run into issues with
having to stop the threads that are using that part of the heap.
(I think this limitation could ultimately be removed for most
practical cases.)
In this case the GC would have to have a list of all objects of
RCT's. Essentially the GC is given to the task of LARC'ing in the
background.
Finally, which such an structural approach(Essentially
implementing the reference list, which is a random access list
containing all the references(pointers to all classes that use
the type)) to ref counting becomes pretty stable and in probably
99%+ of program usage, statically proven to release memory(step a
above, The compiler inserts many static releases that are
released immediately and many types will be simple types that
don't need LARC'ing).
The two issues with this, as always, are speed and memory usage.
I think it should rather fast(most releases are direct and just
release themselves since they don't have references. On top of
that the remaining are mostly FRCT's which can be released nearly
immediately(they can, but don't until they are informed it is ok).
For the full blown RCT types, the way they release things is in a
hierarchical manner which can create avalanches of changes. If
many objects are waiting for one object to be released, then when
it does, they all could be released immediately. But this chain
reaction is simply lazily recursive and eventually terminates.
The only real problem is how to deal with un-managed references
to RCT types. These are references(say pointers) to RCT's that
may not be "smart"(inform the RFC that it is no longer being
refered to. A simple pointer to a class that doesn't do any
communication is the prototypical problem.
The only solution seems to be to just accept it. It is a
necessary evil. Allow it but require the program to know the
intricate details. Such pointers are needed for speed and don't
inherently cause any problems. It's only when they are abused can
issues creep in.
If one, say, had RCT pointers that are used for some types of
problems and simple pointers when they are needed, it becomes
pretty marginal of an issue. (if you are doing driver programming
and have to have speed and direct access then use the simple
pointers but it is your just to know if your references are
working as expected.
In many cases, RCT pointers can be used in many of the cases
which make further minimize the singular problem case(that of
non-informative referencing). (We could call such types "Peeping
Toms"... but as long as they only peek it is ok(not by law, in
programming ;)).
At this, imagine that such a system was implemented. It would
provide almost complete automatic memory management coverage with
releasing resources as they are generally released in the program
flow. In fact, with a GC(don't you just hate it when your garbage
man doesn't pick up your trash sometimes? specially if he never
picks it up) that essentially works parallel, most of the actual
releasing can be done in the background without ever stopping the
program. The benefits are: Better real-time behavior, Some
programs might never need to compact the heap(in fact, if
intelligent heap allocation is used, probably most programs could
benefit from it), and debugging is much easier(Essentially more
like manual management for most objects... just the compiler
handles for us, but we know when, most likely, an object goes out
of scope it will be released). Seg faults due to invalid memory
access will almost be reduced to 0(since the corner case... the
problem case of simple references is not used as much as all the
other ways to use objects).
The downside? There really isn't one except memory usage! One can
come up with special cases that simply don't work well under
LARC'ing but because it allows one to use non-LARC'ing types, the
programmer will just have to manage it on is own or a mixed
combination.
Another great benefit is performance. One may think with all the
all the extra allocation and deallocation code that the end's of
functions would be slowed down(do to objects being lazily
released). This is not necessarily the case because it is
LARC'ing, the releases can be done later... say outside a
performant piece of the code. It can even be faster than directly
releasing it because that could be slow in and of itself, which,
again, with LARC'ing, we can chose a better time to release the
object or do it in the background.
Unfortunately, if I'm not mistaken, with FLARC'ing, all classes
would essentially double in size. This is because if every
reference the type has is a RCT then it would essentially need to
store a pointer to the object as to provide a link to it. With
limited LARC'ing I think one could probably achieve a good
balance. (for large classes and/or a large number of objects, one
could just avoid LARC'ing to prevent the overhead and doing it
manually or with GC)
Some codetalk(tm) of how the interaction goes:
Object s is being initialized. It is a RFC type. It has, as
a member, a object x that is also a RFC type. In the course of s
being initialized, it initializes x. We'll assume it is not null.
To do this s creates the object for x then informs x that it
has a reference to it(we could have multiple types of references
for further optimization if we want). e.g.,
x = new X;
x.CreateReference(s);
s.CreateReferenec(x); // We use directionless binding
references
Now, suppose at some point s is being released. At the end of
scope the compiler inserts
s.Release();
S now tries to release itself. It checks it's ref count and
the list of all references it has.
e.g.,
refCount = 0;
refList = {}
then in s.Release()
if (s.refCount == 0 && s.refList.IsEmpty)
{
GC.RemoveRCT(s); // GC.RemoveRCT is not necessary needed but
using it hear for expressive purposes.
s.Free();
}
else
GC.AddRCT(s); // Here, since the compiler potentially isn't
going insert another s.Release() outside the function we have to
let the GC handle it
in this case s free's itself immediately.
If either refCount != 0 or refList != empty, the GC is given
control(it is made aware that S needs to be periodically checked
to see if it can be released.(the GC simply does the
if (s.refCount == 0 && s.refList.IsEmpty) s.Free(); in a
parallel thread. If s.refCount == 0 and s.refList is empty at
some point then the GC ends up free'ing s.
Also, the compiler may be able to insert another s.Release()
in the outer scope of the function all if it releases that it s
was probably not released before. This doesn't hurt anything
except an extra check that might be wasted(profiling could remove
this check or even insert it).
The only way s can't be released is if it's refCount and
refList are not empty. This is "danling" references and can only
happen when mixing non-RCT's with RCT's.
So we know how s should release it self but how does t know
that s no longer needs it? (so t can release itself at some point)
well, this has to do when x is assigned to. e.g.,
s.x = new X;
// Hidden, we have s update it's ref list and inform both the
hold x and the new x what is going on.
here the assignment will need to remove the old x from refList
and add the new x. This keeps the the list up to date. (there is
a big corner case here that I believe would never occur but will
have to think about it more)
Any ways, I'll leave it at that for now!
More information about the Digitalmars-d
mailing list