std.allocator needs your help

Sun Sep 22 18:35:36 PDT 2013

On 23 September 2013 09:49, Andrei Alexandrescu <
SeeWebsiteForEmail at erdani.org> wrote:

> Hello,
>
>
> I am making good progress on the design of std.allocator, and I am
> optimistic about the way it turns out. D's introspection capabilities
> really shine through, and in places the design does feel really archetypal
> - e.g. "this is the essence of a freelist allocator". It's a very good
> feeling. The overall inspiration comes from Berger's HeapLayers, but D's
> introspection takes that pattern to a whole new level.
>
> Several details are still in flux, but at the top level it seems most
> natural to divide things in two tiers:
>
> 1. Typed allocator, i.e. every request for allocation comes with the exact
> type requested;
>
> 2. Untyped allocator - traffics exclusively in ubyte[].
>
> Initially I thought it's impossible to separate the two, but it turns out
> there are profound generic realities about untyped allocators that make
> them the perfect foundation for typed allocators, without considerable
> abstraction costs.
>
> I need to prove that before I'm sure, because so far I only have an
> archetype of untyped allocators, with typed allocators to follow. There's
> always the possibility that some hitch shows itself only when actually
> trying to implement various typed allocators.
>
> Anyhow, here's the general interface of an untyped allocator. In fact I'll
> showcase the simplest allocator - the NullAllocator, which has zero memory
> to offer. (In spite of its do-absolutely-nothing stance, the NullAllocator
> is surprisingly useful as a "terminator" for certain layered allocators.)
> Anyhow, here it is (explanations follow):
>
> struct NullAllocator
> {
>     enum alignment = real.alignof;
>     enum size_t available = 0;
>     ubyte[] allocate(size_t s) shared { return null; }
>     bool expand(ref ubyte[] b, size_t minDelta, size_t maxDelta) shared
>     { assert(b is null); return false; }
>     bool reallocate(ref ubyte[] b, size_t) shared
>     { assert(b is null); return false; }
>     void deallocate(ubyte[] b) shared { assert(b is null); }
>     void collect() shared { }
>     void deallocateAll() shared { }
>     static shared NullAllocator it;
> }
>
> Primitives:
>
> * alignment is a compile-time constant yielding the alignment of allocated
> data. Many allocators offer the maximum alignment of the platform (for
> which I used real.alignof above). This constant is required for all
> allocators.
>
> * available is a property that returns how many total (not necessarily
> contiguous) bytes are available for allocation. The NullAllocator knows
> statically it has 0 bytes available so it implements it as an enum.
> Generally allocators will implement it as a @property. This property is
> optional.
>
> * allocate(s) returns a ubyte[] with length at least s, or null. (It does
> not throw on out-of-memory because that would hurt composability; it turns
> out many elemental allocators do naturally run out of memory.) This method
> is required for all allocators. In most allocators this method should be
> @safe.
>
> * expand(b, minDelta, maxDelta) grows b's length by at least minDelta (and
> on a best-effort basis by at least maxDelta) and returns true, or does
> nothing and returns false. In most allocators this should be @safe. (One
> key insight is that expand() can be made @safe whereas shrink() or
> realloc() are most often not; such mini-epiphanies are very exciting
> because they put the design on a beam guide with few principles and many
> consequences.) The precondition is that b is null or has been previously
> returned by allocate(). This method is optional.
>
> * reallocate(b, s) reallocates (like C's realloc) b to at least s bytes
> and returns true, or does nothing and returns false. It is possible that
> the memory is moved. This method is optional, and usually unsafe. (This was
> mostly introduced to accommodate C's realloc() within the allocator design.)
>
> * deallocate(b) deallocates the memory allocated for b. b must have been
> previously allocated by the same allocator. This method is usually unsafe
> (but there are notable implementations that may offer safety, such as
> unbounded freelists.) This method is optional.
>
> * deallocateAll() deallocates in one shot all memory previously allocated
> by this allocator. This method is optional, and when present is almost
> always unsafe (I see no meaningful @safe implementation.) Region allocators
> are notable examples of allocators that define this method.
>
> * collect() frees unused memory that had been allocated with this
> allocator. This optional method is implemented by tracing collectors and is
> usually @safe.
>
> * "it" is an interesting artifact. Allocators may or may not hold
> per-instance state. Those that don't are required to define a global shared
> or thread-local singleton called "it" that will be used for all calls
> related to allocation. Of course, it is preferred for maximum flexibility
> that "it" is shared so as to clarify that the allocator is safe to use
> among different threads. In the case of the NullAllocator, this is
> obviously true, but non-trivial examples are the malloc-based allocator and
> the global garbage collector, both of which implement thread-safe
> allocation (at great effort no less).
>
> There are quite a few more things to define more precisely, but this part
> of the design has become quite stable. To validate the design, I've defined
> some simple allocators (Mallocator, GCAllocator, Freelist, StackRegion,
> Region etc) but not the more involved ones, such as coalescing heaps, buddy
> system etc.
>
> If you have an interest in memory allocators and would like to design (or
> have around) an allocator that fits the API above, please post it. For
> layering purposes, don't call malloc() or sbrk() for getting large blocks
> of memory from the system; instead, use another Allocator as your parent
> allocator in a pattern as follows:
>
> struct MyAllocator(Allocator)
> {
>     static if (is(typeof(Allocator.it))) alias Allocator.it _parent;
>     else Allocator _parent;
>     ... use _parent for getting memory ...
> }
>
> This allows MyAllocator to use stateful and stateless allocators
> transparently by just referring to _parent. Of course, your allocator may
> take other generic parameters in addition to Allocator, or take none at all
> if it's a "root" allocator (sbrk() or Mallocator would be examples).
>
> Destroy!

Well it looks like a good start, but the thing I'm left wondering after
reading this is still... how is it actually used?
I think the greatest challenge is finding a simple, clean and correct way
to actually specify which allocator should be used for making allocations
throughout your code, and perhaps more troublesome; within generic code,
and then layers of generic code.

Are you intending to be able to associate a particular allocator with a
class declaration?
What about a struct declaration?
What about a region of code (ie, a call tree/branch).
What if the given allocator should be used for most of the tree, except for
a couple of things beneath that always want to use their explicit allocator?

What if I want to associate an allocator instance, not just an allocator
type (ie, I don't want multiple instances of the same type(/s) of
allocators in my code, how are they shared?

It wasn't clear to me from your demonstration, but 'collect()' implies that
GC becomes allocator-aware; how does that work?

deallocateAll() and collect() may each free a whole lot of memory, but it
seems to me that they may not actually be aware of the individual
allocations they are freeing; how do the appropriate destructors for the
sub-allocations get called?

I have a suspicion you're going to answer most of these questions with the
concept of allocator layering, but I just don't completely see it. It's
quite an additional burden of resources and management to manage the
individual allocations with a range allocator above what is supposed to be
a performance critical allocator to begin with.

C++'s design seems reasonable in some ways, but history has demonstrated
that it's a total failure, which is almost never actually used (I've
certainly never seen anyone use it).

Some allocators that I use regularly to think about:

A ring-buffer:
  * Like a region allocator I guess, but the circular nature adds some
minor details, and requires the user to mark the heap from time to time,
freeing all allocations between the old mark and the new mark.

A pool:
  * Same as a free-list, but with a maximum size, ie, finite pool of
objects pre-allocated and pre-populating the freelist.

A pool-group:
  * Allocate from a group of pools allocating differently sized objects.
(this is a good test for allocator layering, supporting a group of pools
above, and fallback to the malloc heap for large objects)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/digitalmars-d/attachments/20130923/2c21e07f/attachment-0001.html>