[Issue 5813] [patch] std.array.Appender has severe performance and memory leak problems.

d-bugmail at puremagic.com d-bugmail at puremagic.com
Mon Jan 2 13:32:04 PST 2012


http://d.puremagic.com/issues/show_bug.cgi?id=5813



--- Comment #18 from Vladimir Panteleev <thecybershadow at gmail.com> 2012-01-02 13:32:01 PST ---
(In reply to comment #17)
> For part 1 (fastest possible 'chained' appender): Simply construct specifying a
> large number of elements. i.e.
> auto app = Appender!string(4096);
> FastestAppender7 seems to do something similar (enum MIN_SIZE  = 4096;)

The 4th one does that too, albeit implicitly. MIN_SIZE is half a page, but it
will always allocate a little over that; which will cause the GC to return a
whole page. MIN_SIZE was chosen to be the smallest size that results in an
expandable allocation.

Note that the 7th experiment is the least GC-friendly of the whole.

> As for part 2 (fastest practical 'chained' appender) I haven't written it yet.
> In essence you'd have two TLS variables, a scratch node and memory page and an
> in use flag.

Sounds like a nice idea.

> I'm referring to the fact that ptr + i => cast(T*)(cast(size_t)ptr +
> i*T.sizeof) and ptrA - ptrB => (cast(size_t)ptrA - cast(size_t)ptrB) /
> T.sizeof. Which can lead to a subtle performance issues when T.sizeof != 1 or a
> power of 2. But, my first code review doesn't reveal any worrying usages in the
> primary code path; most cases of ptrA-ptrB seem to be behind rarely used if
> conditionals.

Yes, that was my point. Only one multiplication by T.sizeof and one branch are
necessary on the hot path when appending a single item (I see that my code
doesn't follow this due to its usage of slice copies).

I wonder if putting the "cursorEnd > end" check inside the per-item loop in
such cases would be faster - it would be exchanging a multiplication with a
branch. 

> Regarding your previous assertion:
> > 4) You can write a nextCapacity function with no branches
> I'm having trouble finding this in the code. All I can find is:
> 
> auto newCapacity = newSize < MIN_SIZE ? MIN_SIZE : newSize * 2;
> 
> which contains a branch. (i.e. the ?: operator).

The main idea is in fastappender2.d. The "if" at the start can be replaced with
another bitwise or. It doesn't even matter, because that code will not be
executed as often to make a significant difference; I stated it more as a
curiosity.

> Also, I know understand better what you meant by 
> > 1) Extra indirection levels are performance killers
> Unfortunately, your attempt to reduce the number of indirections has broken one
> of the invariants of Appender; specifically that it is a reference type.
> Putting cursors, etc. into the Appender struct will result in data stomping.

I know, I said so in my answer to Andrei's comment. This is fine for my uses.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------


More information about the Digitalmars-d-bugs mailing list