Smart pointers instead of GC?

Mon Feb 3 23:50:50 PST 2014

On Mon, 03 Feb 2014 22:12:18 -0800, Manu <turkeyman at gmail.com> wrote:

> On 4 February 2014 15:23, Adam Wilson <flyboynw at gmail.com> wrote:
>
>> On Mon, 03 Feb 2014 18:57:00 -0800, Andrei Alexandrescu <
>> SeeWebsiteForEmail at erdani.org> wrote:
>>
>>> On 2/3/14, 5:36 PM, Adam Wilson wrote:
>>>
>>>> You still haven't dealt with the cyclic reference problem in ARC. There
>>>> is absolutely no way ARC can handle that without programmer input,
>>>> therefore, it is simply not possible to switch D to ARC without adding
>>>> some language support to deal with cyclic-refs. Ergo, it is simply not
>>>> possible to seamlessly switch D to ARC without creating all kinds of
>>>> havoc as people now have memory leaks where they didn't before. In order
>>>> to support ARC the D language will necessarily have to grow/change to
>>>> accommodate it. Apple devs constantly have trouble with cyclic-refs to
>>>> this day.
>>>>
>>>
>>> The stock response: weak pointers. But I think the best solution is to
>>> allow some form of automatic reference counting backed up by the GC, which
>>> will lift cycles.
>>>
>>> Andrei
>>>
>>>
>> The immediate problem that I can see here is you're now paying for TWO GC
>> algorithms. There is no traditional GC without a Mark phase (unless it's a
>> copying collector, which will scare off the Embedded guys), and the mark
>> phase is actually typically the longer portion of the pause. If you have
>> ARC backed up by a GC you'll still have to mark+collect which means the GC
>> still has to track ARC memory and then when a collection is needed, mark
>> and collect. This means that you might reduce the total number of pauses,
>> but you won't eliminate them. That in turn makes it an invalid tool for
>> RT/Embedded purposes. And of course we still have the costs of ARC. Manu
>> still can't rely on pause-free (although ARC isn't either) memory
>> management, and the embedded guys still have to pay the costs in heap size
>> to support the GC.
>>
>
> So, the way I see this working in general, is that because in the majority
> case, ARC would release memory immediately freeing up memory regularly, an
> alloc that would have usually triggered a collect will happen far, far less
> often.
> Practically, this means that the mark phase, which you say is the longest
> phase, would be performed far less often.
>

Well, if you want the ARC memory to share the heap with the GC, the ARC
memory will need to be tracked and marked by the GC. Otherwise the GC
might try to allocate over the top of ARC memory and vice versa. This
means that every time you run a collection you're still marking all ARC+GC
memory, which will induce a pause. The GC will still STW-collect on
random allocations, and it will still have to mark all ARC memory to make
sure it's still valid. So yes, there will be fewer pauses, but they will
still be there.

> For me and my kind, I think the typical approach would be to turn off the
> backing GC, and rely on marking weak references correctly.
> This satisfies my requirements, and I also lose nothing in terms of
> facilities in Phobos or other libraries (assuming that those libraries have
> also marked weak references correctly, which I expect phobos would
> absolutely be required to do).
>
> This serves both worlds nicely, I retain access to libraries since they use
> the same allocator, the GC remains (and is run less often) for those that
> want care-free memory management, and for RT/embedded users, they can
> *practically* disable the GC, and take responsibility for weak references
> themselves, which I'm happy to do.
>
>
>> Going the other way, GC is default with ARC support on the side, is not as
>> troublesome from an implementation standpoint because the GC does not have
>> to be taught about the ARC memory. This means that ARC memory is free of
>> being tracked by the GC and the GC has less overall memory to track which
>> makes collection cycles faster. However, I don't think that the RT/Embedded
>> guys will like this either, because it means you are still paying for the
>> GC at some point, and they'll never know for sure if a library they are
>> using is going to GC-allocate (and collect) when they don't expect it.
>>
>
> It also means that phobos and other libraries will use the GC because it's
> the default. Correct, I don't see this as a valid solution. In fact, I
> don't see it as a solution at all.
> Where would implicit allocations like strings, concatenations, closures be
> allocated?
> I might as well just use RefCounted, I don't see this offering anything
> much more than that.
>
>> The only way I can see to make the ARC crowd happy is to completely replace
>> the GC entirely, along with the attendant language changes (new keywords,
>> etc) that are probably along the lines of Rust. I strongly believe that the
>> reason we've never seen a GC backed ARC system is because in practice it
>> doesn't completely solve any of the problems with either system but costs
>> quite a bit more than either system on its own.
>
>
> Really? [refer to my first paragraph in the reply]
> It seems to me like ARC in front of a GC would result in the GC running far
> fewer collect cycles. And the ARC opposition would be absolved of having to
> tediously mark weak references. Also, the GC opposition can turn the GC
> off, and everything will still work (assuming they take care of their
> cycles).
> I don't really see the disadvantage here, except that the
> only-GC-at-all-costs-I-won't-even-consider-ARC crowd would gain a
> ref-count, but they would also gain the advantage where the GC would run
> fewer collect cycles. That would probably balance out.
>
> I'm certain it would be better than what we have, and in theory, everyone
> would be satisfied.

I'm not convinced, mostly because it's not likely to be good news for the
GC crowd. First, there are now two GC algorithms running unpredictably at
different times. So while you *might* see a perf win in ARC-only mode,
we'll probably pay for it in GC-backed ARC mode: you still have the chance
of non-deterministic pause lengths with ARC, you have the GC pauses, and
they happen at different times (GC pause on allocate, ARC pause on delete).
Each individual pause *might* be shorter, though there is no guarantee of
that, but you end up paying more time on the whole than you would
otherwise. Remember that with the GC on, the slow part of the collection
(the mark) has to be performed on all memory, not just the GC memory. So
yes, you might delete a bit less, but you're marking just as much, and
you've still got those pesky ARC pauses to deal with. And in basically
everything but games you measure time spent on resource management as a
portion of CPU cycles over time, not time spent per frame.

That ref-count you hand-wave can actually cost quite a lot. Implementing
ARC properly can have some serious perf implications on pointer-ops and
count-ops, because everything has to be performed atomically. And since
this is a compiler thing, you can't say "Don't atomically operate here
because I will never do anything that might race." The compiler has to
assume that at some point you will, and it cannot detect which mode it
needs, or whether a library will ruin your day. The only way you could get
around this is with yet more syntax hints for the compiler, like
'@notatomic'.

Very quickly ARC starts needing a lot of specialized syntax to make it  
work efficiently. And that's not good for "Modern Convenience".

However, you don't have to perform everything atomically with a GC as the  
collect phase can always be performed concurrently on a separate thread  
and in most cases, the mark phase can do stop-the-thread instead of  
stop-the-world and in some cases, it will never stop anything at all. That  
can very easily result in pause times that are less than ARC on average.  
So now I've got a marginal improvement in the speed of ARC over the GC at  
best, and I still haven't paid for the GC.

And if we disable the GC to get the speed back we now require that
everyone on the team learns the specialized rules and syntax for
cyclic-refs. That might be relatively easy for someone coming from C++,
but it will be difficult to teach someone coming from C#/Java, which is
statistically the more likely person to come to D. And indeed it would be
more than enough to stop my company moving to D.

I've seen you say more than once that you can't bond with the GC, and
believe me, I understand. If you search back through the forums, you'll
find one of the first things I did when I got here was complain about the
GC. But what you're saying is "I can't bond with this horrible GC so we
need to throw it out and rebuild the compiler to support ARC." All I am
saying is "I can't bond with the horrible GC, so why don't we make a
better one, that doesn't ruin responsiveness, because I've seen it done in
other places and there is no technical reason D can't do the same, or
better." Now that I've started to read the GC Handbook I am starting to
suspect that using D, there might be a way to create a completely
pause-less GC. Don't hold me to it, I don't know enough yet, but D has
some unique capabilities that might just make it possible.

-- 
Adam Wilson
GitHub/IRC: LightBender
Aurora Project Coordinator