[D-runtime] Fix for array append cache holding data hostage
Steve Schveighoffer
schveiguy at yahoo.com
Thu Dec 23 13:08:03 PST 2010
Cross-posting to both phobos and druntime for a wider audience...
For those of you not familiar with how the array appending cache works, it
essentially caches recent lookups of BlkInfo for non-shared array appends.
What this means is that appending to an unshared array does not have to take the
global GC lock, and allows a lookup to happen almost immediately.
One of the things that the cache required was to keep that array in memory.
Otherwise, there was a possibility that the array would be collected and
reallocated. An append to such an array would result in the stale cache entry
being used as the block info, possibly resulting in the wrong start address,
wrong size, or wrong flags, all of which could be fatal. A good prior
discussion for this issue is listed in this bug:
http://d.puremagic.com/issues/show_bug.cgi?id=3929
In order to solve the problem, I implemented a GC collection hook which runs
*after* the mark phase, but before all threads are resumed. This hook tests all
cache entries to see if any of them are about to be collected. Any entries
which are not marked (i.e. about to be collected) are removed from the cache.
I also had to do some funky stuff with the cache since TLS data is also
scanned. So the cache is now in a malloc'd block that is freed on thread exit
(and lazily allocated on first append).
One of the best tests that showed this problem (or so I thought) was posted to
the newsgroup here:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=22715
However, my patch does *not* fix this problem. I've found that the true reason
for the memory being held is that other data is pointing at the block. With a
200MB block, this is understandable. In a 4GB address space, that's 1/20 of the
valid address space. Any random aligned 4-byte sequence in the stack or TLS
could point to it and prevent it from being collected.
The really bad thing about this is, once the true pointers to that block are
gone, it could be lost forever. The stack data which "points" to it may be very
high up on the stack, which means it never gets collected. It makes me wonder
if we might want to provide an unsafe 'realloc' type function for appending that
forcefully frees blocks that were used in an append loop. A 'unique' type
qualifier could go a long way here.
In any case, I still want to checkin this fix, because it at least removes the
cache as a possible culprit (and fixes 3929). I don't want to commit the patch
until people here agree on the timing. There is a possibility that it
introduces memory corruption bugs (this happened the last time I was tinkering
with array appending), so I don't want to destabilize everything if there is
good reason. Given that we just had a release, I think this would be a good
time to do it. I most likely won't get to it this weekend.
So anyone have any objections to me checking this in?
-Steve
More information about the D-runtime
mailing list