D2 weak references

Leandro Lucarella llucax at gmail.com
Wed Apr 22 12:08:11 PDT 2009


Jason House, el 22 de abril a las 13:31 me escribiste:
> > > Unfortunately, that is not efficient. The state used by the GC is in
> > > flux during the sweep phase, so there is no easy lockless way to access
> > > that data. Having to acquire a lock on every weakref dereference
> > > absolutely kills performance.
> > 
> > The sweep phase don't touch mark bits. I can't possible understand why
> > you should synchronize access to a (temporarily) read-only variable.
> 
> I'm mostly regurgitating what Sean Kelly said. I know that if the sweep
> deallocates at an object, the bits marking it become invalid. I don't
> disagree that reading the bits without a lock should be possible, but
> I also trust Sean's statement that it isn't easy. I'm trying to find
> something that doesn't require reworking or rewriting large parts of
> a garbage collector.

Sean's idea was, if I recall correctly, to move the sweep phase into the
stop-the-world. My idea is different, and it doesn't need rewriting any
large part of the collector. I think the change is pretty small (and can
be easily applied to D1/Tango too).

I'm looking at the Tango code right now, and you know what? The unmarking
is already done when all threads are paused, so all you need for this to
work is a method to expose the mark bit. This can be really easily done by
extending the BlkAttr defined values (or using a reserved value if this
should not be a standard attribute) and ensuring that the mark bit is set
when new pools/bins are allocated.

This patch (Tango) should do that (untested):
------>8------>8------>8------>8------>8------>8------>8------>8------>8------
diff --git a/lib/gc/basic/gcx.d b/lib/gc/basic/gcx.d
index a760465..fd21d44 100644
--- a/lib/gc/basic/gcx.d
+++ b/lib/gc/basic/gcx.d
@@ -81,6 +81,7 @@ private
         NO_SCAN  = 0b0000_0010,
         NO_MOVE  = 0b0000_0100,
         ALL_BITS = 0b1111_1111
+        MARK_BIT = 0b1_0000_0000_0000_0000; // extension to query the mark bit
     }
 
     extern (C) void* rt_stackBottom();
@@ -519,7 +520,14 @@ class GC
             Pool *pool = gcx.findPool(p);
             assert(pool);
 
-            gcx.setBits(pool, cast(size_t)(p - pool.baseAddr) / 16, bits);
+           uint biti = cast(size_t)(p - pool.baseAddr) / 16;
+            gcx.setBits(pool, biti, bits);
+           // Set the mark bit to maintain the invariant that live objects
+           // always have the mark bit set, except when the collector is in
+           // the mark phase (useful for weak reference implementations)
+           // This is not done in setBits() to keep this bit read-only for
+           // the user.
+           pool.mark.set(biti)
         }
         return p;
     }
@@ -2576,6 +2584,8 @@ struct Gcx
 //        if (pool.nomove.nbits &&
 //            pool.nomove.test(biti))
 //            bits |= BlkAttr.NO_MOVE;
+        if (pool.mark.test(biti))
+            bits |= BlkAttr.MARK_BIT;
         return bits;
     }
 
------8<------8<------8<------8<------8<------8<------8<------8<------8<------

This implementation use an "extension" (according to Druntime specs) bit
to return the mark bit when querying objects attributes. The mark bit is
kept read-only, but I'm not sure about that (there are already plenty of
ways to break the GC now, so I don't think security is a good enough
reason to ban the user from doing nasty things ;)

If you try it, please tell me how it went.

> > Mark bits are only written when:
> > 1) Clearing all the mark bits before marking (this is done with all
> >    threads running now, and this is what it should be changed to be done
> >    when all threads are suspended)
> > 2) Marking (threads are suspended in the current implementation)
> 
> Is that what it does? I realized recently that one could simple toggle
> a mark bit and avoid a clearing phase altogether.

Yes, but that would mean to touch all the mark bits in the sweep phase,
which can have bad cache behaviour. But being a bitmap that should be
fairly cheap. But since this involve bit manipulation operations, and
clearing the bitmap can be done very efficiently clearing complete words
in one instruction, I don't see a clear gain in this.

I guess just benchmarks can have a final word on this, but I wouldn't
expect much differences in performance for both approaches.


-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
More people die from a champagne-cork popping,
than from poison spiders



More information about the Digitalmars-d mailing list