Escaping the Tyranny of the GC: std.rcstring, first blood

via Digitalmars-d digitalmars-d at puremagic.com
Sat Sep 27 03:23:19 PDT 2014


On Saturday, 27 September 2014 at 09:38:35 UTC, Dmitry Olshansky 
wrote:
> As usual, structs are value types, so this feature can be 
> mis-used, no two thoughts about it. It may need a bit of 
> improvement in user-friendliness, compiler may help there by 
> auto-detecting common misuse.
>
> Theoretically class-es would be better choice, except that 
> question of allocation pops up immediately, then consider for 
> instance COM objects.
>
> The good thing w.r.t. memory about structs - they are 
> themselves already allocated "somewhere", and it's only 
> ref-counted payload that is allocated and destroyed in a 
> user-defined way.
>
> And now the killer reason to go for structs is the 
> following:
>
> Compiler _already_ does all of life-time management and had 
> numerous bug fixes to make sure it does the right thing. In 
> contrast there is nothing for classes that tracks their 
> lifetimes to call proper hooks.

This cannot be stressed enough.

>
> Let's REUSE that mechanism we have with structs and go as 
> lightly as possible on the untested-LOC budget.
>
> Full outline, of generic to the max, dirt-cheap implementation 
> with a bit of lowering:
>
> ARC or anything close to it, is implemented as follows:
> 1. Any struct that has @ARC attached must have the following 
> methods:
> 	void opInc();
> 	bool opDec(); // true - time to destroy
> It also MUST NOT have postblit, and MUST have destructor.
>
> 2. Compiler takes user-defined destructor and creates proper 
> destructor, as equivalent of this:
> 	if(opDec()){
> 		user__defined_dtor;
> 	}
> 3. postblit is defined as opInc().
>
> 4. any ctor has opInc() appended to its body.
>
> Everything else is taken care of by the very nature of the 
> structs.

AFAICS we don't gain anything from this, because it just 
automates certain things that can already be done manually in a 
suitably implemented wrapper struct. I don't think automation is 
necessary here, because realistically, how many RC wrappers will 
there be? Ideally just one, in Phobos.
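
To make "done manually" concrete, here is a minimal sketch of such a wrapper, written by hand along the lines of the quoted steps 2-4. It is an illustration only, not Phobos' implementation: the Payload layout, get() and refCount() are made up for this example, and it assumes T is plain data (no destructor, no GC pointers).

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical hand-written wrapper; opInc/opDec mirror the quoted proposal.
// Assumes T is plain data (no destructor, no GC pointers) to keep it short.
struct RC(T)
{
    private static struct Payload
    {
        T value;
        size_t count;
    }

    private Payload* payload;

    this(T value)
    {
        payload = cast(Payload*) malloc(Payload.sizeof);
        assert(payload !is null, "out of memory");
        payload.value = value;
        payload.count = 0;
        opInc();                      // step 4: every ctor ends with opInc()
    }

    // Step 3: the postblit is opInc(). It runs on the copy, after the blit.
    this(this)
    {
        if (payload !is null)
            opInc();
    }

    // Step 2: run the "user-defined" cleanup only when opDec() says so.
    ~this()
    {
        if (payload !is null && opDec())
            free(payload);
        payload = null;
    }

    void opInc() { ++payload.count; }
    bool opDec() { return --payload.count == 0; }   // true: time to destroy

    ref T get() { return payload.value; }
    size_t refCount() const { return payload is null ? 0 : payload.count; }
}

unittest
{
    auto a = RC!int(42);
    assert(a.refCount == 1);
    {
        auto b = a;                   // postblit: count goes to 2
        assert(a.refCount == 2);
        assert(b.get == 42);
    }                                 // b's destructor: count back to 1
    assert(a.refCount == 1);
}
```

The unittest block can be run with, for example, `dmd -unittest -main -run rc.d`. Everything the @ARC lowering would generate is the three one-line calls to opInc()/opDec() above.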

> Now this is enough to make ref-counted stuff a bit simpler to 
> write but not much beyond. So here the next "consequences" that 
> we can then implement:
>
> 5. Compiler is expected to assume anywhere in fully inlined 
> code, that opInc()/opDec() pairs are no-op. It should do so 
> even in debug mode (though there is less opportunity to do so 
> without inlining). Consider it an NRVO of the new age, required 
> optimization.
>
> 6. If we extend opInc/opDec to take an argument, the compiler 
> may go further and batch up multiple opInc-s and opDec-s, as 
> long as it's safe to do so (e.g. there could be exceptions 
> thrown!):
>
> Consider:
>
> auto a = File("some-file.txt");
> //pass to some structs for future use
> B b = B(a);
> C c = C(a);
> a = File("other file");
>
> May be (this is overly simplified!):
>
> File a = void, b = void, c = void;
> a = File.user_ctor("some-file.txt");
> a.opInc(2);
> b = B(a);
> c = C(a);
> a = File.user_ctor("other file");
> a.opInc();

I believe we can achieve the same efficiency without ARC with the 
help of borrowing and multiple alias this. Consider the cases 
where inc/dec can be elided:

    RC!int a;
    // ...
    foo(a);
    // ...
    bar(a);
    // ...

Under the assumption that foo() and bar() don't want to keep a 
copy of their arguments, this is a classical use case for 
borrowing. No inc/dec is necessary, and none will happen, if 
RC!int has an alias-this-ed method returning a scoped reference 
to its payload.
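
A rough sketch of that first case with today's language, under some assumptions: plain `ref` stands in for the scoped reference (nothing here stops foo() or bar() from escaping it, which is exactly what borrowing would have to enforce), and the `incs` counter exists only to show that no postblit runs. The wrapper itself is hypothetical, as is the behaviour given to foo() and bar().

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical wrapper; assumes T is plain data.
struct RC(T)
{
    private static struct Payload { T value; size_t count; }
    private Payload* payload;

    static size_t incs;               // counts postblits, for the demo only

    this(T value)
    {
        payload = cast(Payload*) malloc(Payload.sizeof);
        assert(payload !is null, "out of memory");
        payload.value = value;
        payload.count = 1;
    }

    this(this)
    {
        if (payload !is null)
        {
            ++payload.count;
            ++incs;
        }
    }

    ~this()
    {
        if (payload !is null && --payload.count == 0)
            free(payload);
        payload = null;
    }

    // Hands out the payload by reference; no inc/dec involved.
    ref T get() { return payload.value; }
    alias get this;
}

// Neither function keeps its argument beyond the call.
int foo(ref const int x) { return x + 1; }
void bar(ref int x) { x *= 2; }

unittest
{
    auto a = RC!int(20);
    assert(foo(a) == 21);             // rewritten to foo(a.get())
    bar(a);                           // rewritten to bar(a.get())
    assert(a.get == 40);
    assert(RC!int.incs == 0);         // the count was never touched
}
```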

On the other hand, foo() and bar() could want to make copies of 
the refcounted variable. In this case, we still wouldn't need an 
inc/dec, but we need a way to express that. The solution is 
another alias-this-ed method that returns a (scoped) 
BorrowedRC!int, which does not inc/dec on 
construction/destruction, but does so on copying. (It's probably 
possible to reuse RC!int for this; a separate type is likely not 
necessary.)
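
One way this might look when RC!int is reused, as a sketch only: the `borrowed` flag and the explicit borrow() method are assumptions made for this example (multiple alias this is not available, so the conversion is spelled out by hand), and nothing here checks that the borrow does not outlive its owner.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical wrapper; assumes T is plain data.
struct RC(T)
{
    private static struct Payload { T value; size_t count; }
    private Payload* payload;
    private bool borrowed;            // true: this instance owns no reference

    this(T value)
    {
        payload = cast(Payload*) malloc(Payload.sizeof);
        assert(payload !is null, "out of memory");
        payload.value = value;
        payload.count = 1;
    }

    // Copying a borrow (or an owner) always yields an owning instance.
    this(this)
    {
        if (payload !is null)
        {
            ++payload.count;
            borrowed = false;
        }
    }

    // A borrow skips the decrement it never paid for.
    ~this()
    {
        if (payload !is null && !borrowed && --payload.count == 0)
            free(payload);
        payload = null;
    }

    // No inc here; the result reaches the caller without a postblit.
    RC borrow()
    {
        RC r;
        r.payload = payload;
        r.borrowed = true;
        return r;
    }

    size_t refCount() const { return payload is null ? 0 : payload.count; }
}

// A callee that does want to keep a copy pays for exactly one inc.
void foo(RC!int r, ref RC!int slot) { slot = r; }

unittest
{
    auto a = RC!int(7);
    {
        auto b = a.borrow();          // no inc on construction
        assert(a.refCount == 1);
    }                                 // no dec on destruction
    assert(a.refCount == 1);

    RC!int slot;
    foo(a.borrow(), slot);            // only the copy into slot counts
    assert(a.refCount == 2);
}
```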

The other opportunity is on moving:

     void foo() {
         RC!int a;
         // ....
         bar(a);    // last statement in foo()
     }

Here, clearly `a` isn't used after the tail call. Instead of copy 
& destroy, the compiler can resort to a move (bare bitcopy). In 
contrast to C++, this is allowed in D, because D structs must 
remain valid when relocated by a plain bitwise copy.
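
The difference can be seen today by requesting the move explicitly with std.algorithm's move(); the point above is that the compiler could do this by itself on the last use. The sketch below uses a hypothetical wrapper (assuming T is plain data) and a bar() that simply reports the count it observes.

```d
import core.stdc.stdlib : malloc, free;
import std.algorithm.mutation : move;

// Hypothetical wrapper; assumes T is plain data.
struct RC(T)
{
    private static struct Payload { T value; size_t count; }
    private Payload* payload;

    this(T value)
    {
        payload = cast(Payload*) malloc(Payload.sizeof);
        assert(payload !is null, "out of memory");
        payload.value = value;
        payload.count = 1;
    }

    this(this)
    {
        if (payload !is null)
            ++payload.count;
    }

    ~this()
    {
        if (payload !is null && --payload.count == 0)
            free(payload);
        payload = null;
    }

    size_t refCount() const { return payload is null ? 0 : payload.count; }
}

// Takes its argument by value and reports the count it sees.
size_t bar(RC!int r) { return r.refCount; }

unittest
{
    auto a = RC!int(1);

    // Copy & destroy: postblit on entry, destructor on exit.
    assert(bar(a) == 2);
    assert(a.refCount == 1);

    // Move: the bits are relocated into the parameter, no inc/dec pair.
    assert(bar(move(a)) == 1);
    assert(a.refCount == 0);          // a was reset to RC!int.init
}
```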

This covers most opportunities for elision of the ref counting. 
It only leaves a few corner cases (e.g. `a` no longer used after 
non-tail calls, accumulated inc/dec as in your example). I don't 
think these are worth complicating the compiler with ARC.

