Performance of GC.collect() for single block of `byte`s
Steven Schveighoffer
schveiguy at gmail.com
Mon Oct 1 13:51:15 UTC 2018
On 10/1/18 3:21 AM, Rainer Schuetze wrote:
>
>
> On 28/09/2018 14:21, Per Nordlöw wrote:
>> On Monday, 24 September 2018 at 14:31:45 UTC, Steven Schveighoffer wrote:
>>> It's not scanning the blocks. But it is scanning the stack.
>>>
>>> Each time you are increasing the space it must search for a given
>>> *target*. It also must *collect* any previous items at the end of the
>>> scan. Note that a collection is going to mark every single page and
>>> bitset that is contained in the item being collected (which gets
>>> increasingly larger).
>>
>> Is this because of the potentially (many) slices referencing this
>> large block?
>>
>> I assume the GC doesn't scan the `byte`-array for pointer-values in
>> this case, but that happens for `void`-arrays and class/pointer-arrays
>> right?
>>
>> Couldn't that scan be optimized by adding a bitset that indicates
>> which pages need to be scanned?
>>
>> Is it common for GC's to treat large objects in this way?
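To the scanning question: in D, blocks whose element type contains no pointers (like `byte`) are allocated with the `NO_SCAN` attribute, so the GC never looks inside them for pointer values, while `void*`/class arrays are scanned. A minimal sketch (the helper name is mine, not from the thread):

```d
import core.memory : GC;

/// True if the GC will skip scanning the contents of the given allocation.
/// p must be the base pointer of a GC-allocated block.
bool notScanned(const void* p)
{
    return (GC.getAttr(p) & GC.BlkAttr.NO_SCAN) != 0;
}
```

For example, `notScanned((new byte[](64)).ptr)` is true, while for `new void*[](64)` the attribute is not set and the block's contents are scanned during marking.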
>
> A profiler reveals that most of the time is spent in "sweeping" the
> memory, i.e. looking for allocations no longer referenced. The existing
> implementation checks every page which causes a linear growth of
> required CPU resources with used memory.
Ouch! I hadn't thought of that. But we aren't checking actual *pages*,
right? We are checking bits to see where allocations are present?
I also remember that the bitsets are always allocated with one bit per
16 bytes, regardless of the block size for that page/pool. I didn't love
that design, but maybe it has been fixed by now.
I would imagine that checking 64 pages at once should be possible by
AND-ing the allocated bits with the inverse of the mark bits, so
unreferenced entries can be found a whole word at a time.
>
> This version https://github.com/rainers/druntime/tree/gc_opt_sweep takes
> advantage of the known size of allocations to skip unnecessary checks.
> The last commit also adds support for keeping track of the size of
> blocks of consecutive free pages. With this your example has more or
> less constant collection time (note that most of the program time is
> spent setting the array to zero, though not measured, and that the
> allocation often triggers a collection, too).
Yes, especially when you are increasing the allocation size each time.
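A minimal benchmark in the spirit of the example under discussion (a sketch; the function name and parameters are mine, not from the original code):

```d
import core.memory : GC;
import core.time : Duration;
import std.datetime.stopwatch : StopWatch, AutoStart;

/// Time GC.collect() after each of a series of growing allocations.
Duration[] collectTimes(size_t steps, size_t growth)
{
    Duration[] times;
    foreach (i; 1 .. steps + 1)
    {
        // Each step allocates a larger block; the allocation itself may
        // also trigger a collection, which is not measured here.
        auto block = new byte[](i * growth);
        auto sw = StopWatch(AutoStart.no);
        sw.start();
        GC.collect(); // the previous, now-unreferenced block gets swept
        sw.stop();
        times ~= sw.peek();
    }
    return times;
}
```

With the old sweep, the measured times grow with the total memory in use; with the size-aware sweep they stay roughly constant.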
> I also noticed a rather serious bug for huge allocations:
> https://issues.dlang.org/show_bug.cgi?id=19281
-Steve
More information about the Digitalmars-d-learn mailing list