Optimization fun

H. S. Teoh via Digitalmars-d digitalmars-d at puremagic.com
Fri Nov 7 09:30:47 PST 2014


On Fri, Nov 07, 2014 at 08:46:29AM -0500, Steven Schveighoffer via Digitalmars-d wrote:
> On 11/7/14 8:43 AM, Steven Schveighoffer wrote:
> >On 11/7/14 1:05 AM, H. S. Teoh via Digitalmars-d wrote:
> >>
> >>A little more investigation revealed the culprit: a queue object
> >>implemented with subarrays, and subarrays were being used as stacks
> >>by appending with ~= and popping by decrementing .length. A
> >>dangerous combination that causes excessive reallocations! After
> >>inserting assumeSafeAppend() in the right place, the memory
> >>consumption rate dropped to more reasonable levels. It's still
> >>visibly faster than when GC is enabled, so probably there are other
> >>places with excessive allocations, but this particular change stood
> >>out because after adding assumeSafeAppend() to the code, the
> >>solver's performance *even with GC enabled* improved by about
> >>40%-50%. So this explains why calling GC.disable() had such a
> >>pronounced effect before: the GC is too busy cleaning up after
> >>unnecessarily allocation-hungry code!
> >
> >In D1 this would have worked fine. A "queue array" wrapper may be a
> >good candidate for Phobos that works just like a normal array but
> >calls assume safe append whenever decrementing length.
> 
> Also to note -- even with this, you will create garbage. As the array
> grows, it may have to relocate to a new block. The old block would be
> garbage.
> 
> Once you get large enough, there is a chance that it will just extend
> into new pages. But if it ever has to reallocate a very large array,
> then you just created a very large piece of garbage that will not be
> cleaned up.

Good point. Maybe that's where the remaining garbage is coming from:
remnants of reallocated arrays. Some of the subarrays I'm using does
grow pretty big, so things could add up.


> I'm wondering if said "queue array" should have an option to
> forcefully delete the old array. In which case, we'd have to make sure
> the data cannot be accessed directly, or mark it as unsafe.
[...]

Well, if we're going to manually manage the array's memory, then we're
back in malloc/free territory along with all of their pitfalls. :-) So I
wouldn't sweat it too much -- I'd make the queue a non-copyable
reference type (so there would be no aliasing of the queue array) and
leave it at that.

On another note, I wonder if reducing the frequency of GC collections
might help keep memory under control while minimizing the performance
hits. As a first stab, I could try to manually trigger a collection
every, say, 100,000 iterations of the main loop, which would be about 10
or so collections per run, which should keep memory usage under control
while avoiding excessively frequent collections that degrade
performance.


T

-- 
"I speak better English than this villain Bush" -- Mohammed Saeed al-Sahaf, Iraqi Minister of Information


More information about the Digitalmars-d mailing list