GC Precision

Mon Oct 26 18:34:18 PDT 2009

== Quote from Leandro Lucarella (llucax at gmail.com)'s article
> dsimcha, el 26 de octubre a las 23:05 me escribiste:
> > I've spent some free brain cycles today thinking about this and here's the scheme
> > I have in mind.  If anyone thinks this could be improved in a way that would not
> > have substantial ripple effects throughout the compiler/language (because then it
> > might never actually get implemented) let me know.
> >
> > 1.  GC.malloc's signature changes to GC.malloc(size_t size, uint ba = 0, size_t*
> > bitmask = null).  A null bitmask means use the old-school behavior and either scan
> > everything or don't scan anything based on the NO_SCAN bit.  IMHO plain old
> > conservative scanning must be supported to allow for untyped memory blocks to be
> > allocated, because requiring every memory block to have a type associated with it
> > is an unacceptable limitation in a systems language.
> >
> > For now, the only way to get precise scanning would be to call malloc directly or
> > use a template that does.  Eventually new would either be fixed to instantiate the
> > bitMask!(T) template or eliminated entirely.  Since using the precise scanning by
> > writing a template that calls GC.malloc() is a lot easier than hacking the
> > compiler, this may be a catalyst for getting rid of new.
> Did you read the thread I posted? What do you think about Fritz's idea on
> how to encode the pointers information? I'm not very familiar with
> TypeInfo, but wouldn't be more natural to pass the TypeInfo to GC.malloc()
> directly if it can get the pointer information itself instead of
> translating that information? I think if that's possible it will keep the
> GC interface simpler.

As far as I can tell, RTTI doesn't have all the stuff that's needed yet for
structs and adding it would require hacking the compiler.  Frankly, I want this to
be simple enough to actually get implemented as opposed to just being talked about
and deadlocking on everyone waiting on everyone else to do something.

> > 2.  Some concern has been expressed in the past about the possibility of using 4
> > bytes per block in overhead to store pointers to bitmasks.  IMHO this concern is
> > misplaced because in any program that takes up enough memory for space efficiency
> > to matter, false pointers waste more space.  Furthermore, if you're programming
> > for a toaster oven, elevator controller, etc., you probably aren't going to use
> > the standard out-of-the-box GC anyhow.
> This seems reasonable, but I don't see why we can't use a more efficient
> way to store this information in the GC. Implementation simplicity (to
> have a better GC now instead of a perfect GC in some point in a far
> future) is a good enough reason :) I'm just curious if you found any flaws
> in the scheme proposed by Frits or is just you want a simpler
> implementation.

Two things:  Implementation simplicity is one.  As I've been alluding to, worse
and works in practice is better than better and only exists on paper.  The other
is that I don't understand, in Frits's approach, how do you store the size of the
object so you know how many bits to interpret as the mask?

> > However, I've thought of ways to mitigate this and have come up with the
> > following:
> >
> >     A.  Store a pointer to the bitmask iff the NO_SCAN bit is set.
> >         this is a no-brainer and will prevent any overhead on,
> >         for example, arrays of floats.
> Good idea.
> >     B.  Add another attributes bit to the GC called something like
> >         NEEDS_BITMASK.  This bit would be set iff an object mixes
> >         pointers and non-pointers.  If it's not set, no bitmask
> >         pointer would be stored.  However, the overhead of an
> >         extra bit may or may not be worth it.
> You mean for the special case where all the attributes of an object are
> pointers? I think this should be rare enough, so I doubt about the
> utility, but maybe I'm missing some common case.

You're probably right.  The only common one I can think of is arrays of classes.
For an array, the 4 bytes of overhead is usually negligible.

> > 3.  The bitmask pointer would be stored at the end of every GC-allocated block for
> > which a bitmask pointer is stored.  The reason for using the end of the block
> > instead of the beginning is just implementation simplicity:  That way, finding the
> > beginning of a block would work the same whether or not we have a bitmask pointer.
> Did you even consider storing this information outside the memory block
> (like the flags)? I think storing them in the memory block can be annoying
> if they are not stored always because now your fixed sized blocks are not
> fixed. It might be very easy to overcome this, but maybe thinking about
> the other option is worthy.

The flags in the GC are designed to store single bits apparently.  Also, as far as
weak refs, we could use another high-order bit for that.  I don't think anyone
cares if (on 32-bit) T.sizeof is limited to about a gigabyte.  The question is,
how many high-order bits do we reserve?  BTW, I'm not going to actually implement
weak references unless I see an easy way to do it, we're only talking about
reserving bits here.

I guess the bottom line is that IMHO precise heap scanning is really low-hanging
fruit where the bang for the buck would be huge.  Improving the GC in other ways
raises some nasty issues, but in this case I really think the perfect is the enemy
of the good.  I'm not guaranteeing anything, but I think I might be able to have
precise heap scanning up and running in a release or two if I really keep it
simple and stupid.