Memory Management in D: Request for Comment

dsimcha dsimcha at yahoo.com
Mon Nov 2 21:46:29 PST 2009


During my occasional forays into the dark side of Python and Java, I am often
amazed at the extent to which memory management in these languages "just
works".  D should be like this for all but the most low-level programming
tasks, and was intended to be.  It seems like most of the other regulars here
have carved out niches that don't involve improving memory management.

My attempts at adding precise heap scanning to the GC got me thinking about
other ways to improve memory management in D.  I love most aspects of D, but
the constant memory management issues make it feel like much less of a
high-level language than it should.  I'm thinking of making this my
niche around here, as I already know more about the problem than I ever wanted
to know and I'm sick of having memory management not work properly.  Here are
some things that I'd like comments on:

1.  The Hans Boehm GC uses a blacklisting scheme whereby it avoids allocating
memory pages that currently have false pointers pointing into them.  (If a
page is not allocated but looks like it has a pointer into it, then we can
assume this is a false pointer.)  If I dip into the GC code again and
implement something like this, we'll be one step closer to making D memory
management "just work" and making false pointers a thing of the past.
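
To make the blacklisting idea concrete, here is a rough sketch in C.  It is
not the Boehm GC's actual code and not druntime's; the Heap struct, the
function names, the page size and the tiny 64-page heap are all made up for
illustration.  The marker reports every word it scans, and any word that lands
in a currently free page causes that page to be skipped by the page allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define NUM_PAGES 64u   /* hypothetical tiny heap: 64 pages */

typedef struct {
    uintptr_t base;                 /* address of page 0 */
    bool allocated[NUM_PAGES];      /* page is currently in use */
    bool blacklisted[NUM_PAGES];    /* a stray word points into this free page */
} Heap;

/* Called by the marker for every word it scans conservatively. */
static void note_candidate(Heap *h, uintptr_t word)
{
    if (word < h->base || word - h->base >= (uintptr_t)PAGE_SIZE * NUM_PAGES)
        return;                             /* not inside our heap at all */
    size_t page = (size_t)((word - h->base) / PAGE_SIZE);
    assert(page < NUM_PAGES);
    if (!h->allocated[page])
        h->blacklisted[page] = true;        /* free page, so a false pointer */
}

/* Hand out the first free page that is not blacklisted; -1 if none. */
static long alloc_page(Heap *h)
{
    for (size_t i = 0; i < NUM_PAGES; i++) {
        if (!h->allocated[i] && !h->blacklisted[i]) {
            h->allocated[i] = true;
            return (long)i;
        }
    }
    return -1;
}
```

A real implementation would presumably also need to rebuild the blacklist on
each collection and decide what to do when only blacklisted pages remain; the
sketch leaves both out.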

2.  I've mentioned this a few times here before, but I wrote a memory allocator
called TempAlloc that is built on a second stack.  It allows stack-based
memory allocation that is not necessarily bound to function calls, and it
automatically falls back on heap allocation for very large objects rather than
killing your program with a "stack overflow" error message.  I've also written some
implementations of common data structures (so far arrays, hash tables and
sets; I'll probably add binary trees at some point) that are optimized for it.
(See http://svn.dsource.org/projects/dstats/docs/alloc.html.)
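
For readers who haven't looked at it, the general technique can be sketched in
a few lines of C.  This is only an illustration of a second stack with heap
fallback, not TempAlloc's real API: the TempStack struct, temp_alloc,
temp_free, the 64 KB region and the 16-byte alignment are all assumptions
made for the example, and the real thing is written in D and is thread-local.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define REGION_SIZE (64u * 1024u)   /* hypothetical size of the second stack */
#define ALIGNMENT   16u

typedef struct {
    unsigned char *mem;   /* backing store, allocated once */
    size_t used;          /* bytes in use, i.e. the "stack pointer" */
} TempStack;

static int temp_init(TempStack *s)
{
    s->mem = malloc(REGION_SIZE);
    s->used = 0;
    return s->mem != NULL;
}

/* Does p point into the second stack (as opposed to the heap)? */
static int on_stack(const TempStack *s, const void *p)
{
    uintptr_t a = (uintptr_t)p, lo = (uintptr_t)s->mem;
    return a >= lo && a - lo < REGION_SIZE;
}

/* Bump-allocate from the second stack; use the heap if it does not fit. */
static void *temp_alloc(TempStack *s, size_t n)
{
    if (n == 0)
        n = 1;
    size_t rounded = (n + ALIGNMENT - 1) & ~(size_t)(ALIGNMENT - 1);
    if (rounded < n || rounded > REGION_SIZE - s->used)
        return malloc(n);               /* too large: heap fallback */
    void *p = s->mem + s->used;
    s->used += rounded;
    return p;
}

/* Stack blocks must be freed in reverse order of allocation. */
static void temp_free(TempStack *s, void *p)
{
    if (!on_stack(s, p)) {
        free(p);                        /* came from the heap fallback */
        return;
    }
    size_t off = (size_t)((unsigned char *)p - s->mem);
    assert(off <= s->used);
    s->used = off;                      /* pop everything at or above p */
}
```

Because allocation and deallocation are explicit calls rather than tied to
function entry and exit, a block can outlive the function that allocated it,
as long as frees still happen in last-in, first-out order.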

The biggest problem with TempAlloc is that it is not scanned by the GC,
meaning that you can't store heap-allocated data in it unless you have another
reference somewhere else.  I don't know how to remedy this, partly because the
stacks are thread-local and I don't know how to remove a range from the GC
when a thread terminates, even if I hack the GC to give it the features it
needs to properly scan TempAlloc.  Advice would be appreciated.

Other than GC scanning, is there anything else you would like to see added to
TempAlloc?  Do you think it's general enough to be included in core.memory, or
is it too niche?

3.  This one is an order of magnitude less likely than the other two to
actually get implemented, at least by me, but how about thread-local
allocators so you can call malloc() without taking a lock?  I vaguely remember
Sean saying he was working on that a while back, but I never heard anything
about it again.  It's probably best to wait for shared to be implemented for
this so that unshared objects can also be collected without stopping the world,
but we should start at least discussing this now.
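
As a starting point for that discussion, the core of a lock-free fast path
might look something like the C11 sketch below.  It is a deliberately minimal
illustration, not a proposal for the actual GC: the single 64-byte size class
and the names tl_alloc and tl_free are hypothetical, and blocks are never
returned to the shared allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE 64u   /* hypothetical single size class */

typedef union Block {
    union Block *next;                  /* link while on a free list */
    unsigned char payload[BLOCK_SIZE];
} Block;

/* One free list per thread: no other thread can see it, so no lock. */
static _Thread_local Block *free_list = NULL;

static void *tl_alloc(void)
{
    Block *b = free_list;
    if (b != NULL) {                    /* fast path: pop without locking */
        free_list = b->next;
        return b;
    }
    return malloc(sizeof(Block));       /* slow path: shared allocator */
}

static void tl_free(void *p)
{
    assert(p != NULL);
    Block *b = p;
    b->next = free_list;                /* push onto the calling thread's list */
    free_list = b;
}
```

Only the slow path touches the shared allocator, so a thread that mostly
recycles its own blocks rarely contends with other threads.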


4.  I submitted a patch a while back to allow the GC to ignore interior
pointers for specific objects
(http://d.puremagic.com/issues/show_bug.cgi?id=2927).  This would be useful if
you have, for example, a large array that never escapes a class and the class
always maintains a pointer to the head of the array as long as it's alive.
This way, when the class dies, the array dies too even if there are false
pointers to its interior.  Few people have commented on this.  Is there any
reason why it's not a good idea?  Yes, it's somewhat unsafe if you're not
careful, but when the alternative is horrible memory leaks, sometimes
unsafeness is a necessary evil.
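
The effect on the marker can be sketched roughly as follows.  This is C
written for illustration, not the code from the patch; the BlockInfo struct
and the no_interior flag are hypothetical names.  With the flag set, only a
word equal to the block's base address keeps the block alive, so a false
pointer into the middle of a large array no longer pins it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t base;         /* address of the first byte of the block */
    size_t    size;         /* block size in bytes */
    bool      no_interior;  /* hypothetical per-block flag */
} BlockInfo;

/* Does this conservatively scanned word keep the block alive? */
static bool keeps_alive(const BlockInfo *b, uintptr_t word)
{
    assert(b != NULL);
    if (word < b->base || word - b->base >= b->size)
        return false;               /* points outside the block */
    if (b->no_interior)
        return word == b->base;     /* only a pointer to the head counts */
    return true;                    /* default: any interior pointer counts */
}
```

The unsafe part is visible here too: if the program really does hold only an
interior pointer (a slice of the array, say), the block would be collected
out from under it.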


