std.allocator needs your help

Tue Sep 24 08:56:59 PDT 2013

On 9/24/13 1:46 AM, Dmitry Olshansky wrote:
> 23-Sep-2013 03:49, Andrei Alexandrescu пишет:
> Looks good (s/ubyte[]/void[] per current discussion).

Changed.

> Do you imagine Typed allocators as something more then adapters that
> simplify a common pattern of allocate + emplace / destroy + deallocate?
> (create!T/destroy!T)

Well as a simple matter tracing allocators will have to store dynamic 
type information (just like the current GC does).

Also, I hope we'll be able to allow allocators to define Pointer(T), 
Ref(T) etc. that supplant replacements for the built-in notions of 
pointer, reference etc. Then, user code that uses these notions instead 
of the built-in ones will be guaranteed some nice properties (e.g. 
automatic reference counting). Unfortunately I don't see a way for an 
allocator to enforce that its users don't do illegal things such as 
escaping addresses etc. So it would be a discipline-backed scheme. 
Notable beneficiaries will be containers.

> First things first - can't allocator return alignment as run-time value
> - a property (just like 'available' does)? The implementation contract
> is that it must be O(1) vanila syscall-free function. (Read this as
> consult system info exactly once, at construction).

It could, but as I mentioned to Manu - at this level any cost is 
significant. Even changing from a compile-time constant to a global 
static dynamically-initialized constant has a cost. Making alignment an 
instance variable turns stateless allocators into stateful ones.

> Thinking more of it - it also may be immutable size_t? Then it gets
> proper value at construction and then is never changed.

I'm hoping to get away with a static function "static uint alignment()" 
that (a) is CTFEable for most allocators/platforms, (b) uses a 
dynamically-initialized static constant for a few. Then derived 
allocators will inherit CTFEability.

Would a per-allocator-type alignment suffice?

>> * deallocate(b) deallocates the memory allocated for b. b must have been
>> previously allocated by the same allocator. This method is usually
>> unsafe (but there are notable implementations that may offer safety,
>> such as unbounded freelists.) This method is optional.
>
> Does the following implication hold "have a deallocate" --> must be
> manually managed? Otherwise how would one reliably work with it and not
> leak? This brings us to traits that allocators may (should) have
> a-la automatic? free-all on termination? Zeros on allocate (more common
> then one would think)? etc.

The global GC does offer manual deallocation. It's the presence of 
collect() that indicates tracing abilities. If deallocateAll() is 
present, user code must assume it will be called during destruction.

That said, I've also been thinking of adding a variety of Boolean 
telling properties. "Zeros on allocate" has an important relationship 
with safety of higher-level objects - zeros offer "not really 
initialized but without forged pointers so memory-safe" objects.

>> * deallocateAll() deallocates in one shot all memory previously
>> allocated by this allocator. This method is optional, and when present
>> is almost always unsafe (I see no meaningful @safe implementation.)
>> Region allocators are notable examples of allocators that define this
>> method.
>
> Okay.. I presume region "mark" works by spiting off a subinstance of
> allocator and "release" by deallocateAll().

Nice, didn't think of this.

>> * collect() frees unused memory that had been allocated with this
>> allocator. This optional method is implemented by tracing collectors and
>> is usually @safe.
>>
>
> This seems hard and/or suboptimal w/o typeinfo --> typed allocators? I
> guess they could be more then a simple helper.

Yah, collect() is not really appropriate at this level...

>> There are quite a few more things to define more precisely, but this
>> part of the design has become quite stable. To validate the design, I've
>> defined some simple allocators (Mallocator, GCAllocator, Freelist,
>> StackRegion, Region etc) but not the more involved ones, such as
>> coalescing heaps, buddy system etc.
>
> The problem I see is that allocators are rooted in poor foundation -
> malloc is so out of fashion (its interface is simply too thin on
> guarantees), sbrk is frankly a stone-age syscall.
>
> I would love to make a root allocator one on top of mmap/VirtualAlloc.
> This leads me to my biggest problem with classical memory management -
> ~20 years of PCs having virtual memory support directly in the CPU, and
> yet hardly anything outside of OS takes advantage of it. They
> (allocators) seem simply not aware of it - none of interface implies
> anything that would help user take advantage of it. I'm talking of
> (potentially) large allocations that may be often resized.
>
> (large allocations actually almost always a blind spot/fallback in
> allocators)
>
> To the point - we have address space and optionally memory committed in
> there, leading us to a 3-state of an  _address range_:
> available, reserved, committed. The ranges of 2nd state are dirt cheap
> (abundant on 64bit) and could be easily flipped to committed and back.
>
> So I have a pattern I want to try to fit in your design. At present it
> is a datastructure + allocation logic. What I would want is clean and
> nice composition for ultimate reuse.
[snip]

Interesting. I'll be looking forward to what you come up with. Yes, for 
large, coarse-granularity allocations, tapping into OS' paging 
capabilities is the best way to go.

> Now how to fit this monster into the current API...
> Plainly I need a way to tell hey Mr. Allocator RESERVE me a 1Gb of RAM
> but don't chew on, it please. In other words - allocate needs something
> like a "guaranteed capacity of the block". By default it could be == size.
>
> struct Allocator
> {
> ...
> //allocate size bytes with a block of with no less then capacity bytes.
> void[] allocate(size_t size, size_t capacity=size);
> ...
> }
>
> Or maybe somehow add a reserve method explicitly...

Assuming you have an optional two-argument allocate() primitive, once 
you get the initial allocation, how do you commit it? By calling 
expand() I assume? Would that be enough?

In the reserve()-based API - how would that work? Does reserve returns a 
void[] with size zero and then you expand() it?

Andrei