std.allocator: FreeList uses simple statistics to control number of items
Andrei Alexandrescu via Digitalmars-d
digitalmars-d at puremagic.com
Wed May 20 10:28:53 PDT 2015
https://github.com/andralex/phobos/commit/90120fc290bc7840ffbee22766798518b3418e15
There is a bothersome issue with freelists fronting general-purpose
allocators (https://en.wikipedia.org/wiki/Free_list): they can grow
indefinitely. Because they keep memory allocated in their parent, they
cause fragmentation to it.
The worst pattern for a freelist is: an application allocates a bunch of
blocks of the specific size the freelist is tuned for, then frees most
of them and proceeds with allocating other sizes. In that case the large
freelist is essentially leaked.
My past FreeList implementation had a maximum list length as a
parameter. That was quite a primitive way to go about it. But what I
want is a way to model the application's behavior and automatically
adapt to the allocation patterns. In particular, if demand for blocks of
the freelist size dwindles, it should be possible for the freelist
length to progressively go down to zero.
So there are three possible allocation events:
1. The request is for a fit size and the free list is empty. That's a
miss. It will be served from the parent allocator, and upon freeing the
block will be added to the free list.
2. The request is for a fit size and the free list is not empty. That's
a hit - nice. Serve it from the free list.
3. The request is for an unfit size. That's always served from the
parent allocator, and returned to it upon deallocation.
What we want is to minimize the likelihood of a miss. Say over the past
N allocation requests we have N1, N2, and N3 counts for the three
events. Then we can estimate the probability of a miss from these past
events:
Pmiss = N1 / N
Trouble is, all counts keep increasing and there is little adaptation to
changing patterns - the stats are averaged over the entire lifetime.
It's better to keep stats over a fixed-size rolling window, say the past
Nw events. In that case, if a new miss event occurs, we update the
estimated miss probability as such:
Pmiss = (Pmiss * Nw + 1) / (Nw + 1)
That is, in the recent past we've seen Pmiss * Nw misses, now we see one
more (+ 1), and we divide everything by the increased number of events.
In case of a non-miss (hit or unfit), the update is:
Pmiss = Pmiss * Nw / (Nw + 1)
Note how the first equation converges to Pmiss = 1 and the second to
Pmiss = 0. Nice.
In a batch of events Nbatch of which there were Nmiss misses, the update is:
Pmiss = (Pmiss * Nw + Nmiss) / (Nw + Nbatch)
So after each allocation event, or after a few of them, we have a nice
estimate of the probability we're missing the opportunity of allocating
off the freelist. What I'm less sure about is how to efficiently wire
that information into a feedback loop. Specifically:
1. How to update Pmiss efficiently? Most allocations should be fast, so
it shouldn't be update with every call to allocate(). What I have now is
I update every K calls.
2. How and when should the estimated Pmiss be used to reduce the size of
the freelist? What I do now is when Pmiss is below a fixed threshold, I
just periodically yank one element off of it.
Better ideas are welcome!
Andrei
More information about the Digitalmars-d
mailing list