[phobos] opinions on bug 5370
dmitry.olsh at gmail.com
Mon Aug 8 06:24:18 PDT 2011
On 08.08.2011 17:07, Steve Schveighoffer wrote:
>> From: Dmitry Olshansky<dmitry.olsh at gmail.com>
>> On 04.08.2011 7:21, Brad Roberts wrote:
>>> It's listed as a regression, but it looks like a deliberate behavior change. My personal opinion is that buffers don't
>>> contain pointers. They shouldn't participate in garbage collection.
>>> Either way, a decision needs to be made and either the changes applied or the bug closed as won't fix.
>> When I filed it I was certain it's a regression, I discovered it while chasing memory corruption in my D2 port of DMDscript.
>> All boiled down to "it works untill garbage collection", and then I found the cause: the original code stores pointers to heap allocated objects in OutBuffer. So I think it _was_ supposed to be able to hold pointers to heap allocated stuff.
>> Also grep code for:
>> GC.clrAttr(data.ptr, GC.BlkAttr.NO_SCAN);
>> I think this made it to scan GC before array LRU append cache. Nowdays this part of code does nothing, since it somehow doesn't point to _block_. At any rate, I can wrap up the patch as pull request if I convinced anybody.
> I think I can shed some light on this.
> It's not so much the LRU cache but the array append code (that is the code that stores the number of *used* bytes).
> In the case where a shared array gets longer than 4096 bytes (including metadata), there is the possibility to *extend* into consecutive pages. If I store the metadata at the end of the block, then this scenario may occur:
> 1. thread A appends, gets the block info
> 2. thread B appends, gets the block info
> 3. thread A is able to extend to a consecutive page. The metadata is moved to the end of the new page, and overwrites the original metadata with the appended data.
> 4. thread B, using its stale block info, reads the overwritten metadata, corruption ensues.
> So I decided for blocks>= PAGESIZE should store the metadata at the beginning of the block. This way, at least when using stale block info, thread B still sees valid metdata, and reallocates.
> The other option I think is to hold the GC lock for the entire append operation, not just for the block lookup (I have a separate lock used to prevent races when modifying shared metadata).
> If you have other ideas, I'd be open to changing it.
Now it makes it abundantly clear why the code here is failing, in my fix
I just changed underlying array type to void avoiding all the tricky
calls to GC.clearAttr. As for the position of metadata, I'm no expert on
runtime and GC can't really say help here. I think outbuffer might be
not so extensively used as other parts of phobos so the this error
hasn't surfaced before.
> BTW, in my opinion, I think OutBuffer should simply be templated on the array type.
Agreed, it looks like the best solution to me. Still the default was to
scan array, so if we are changing this default, at minimum it should be
noted it in documentation and changelog.
More information about the phobos