std.allocator needs your help
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Mon Sep 23 06:58:42 PDT 2013
On 9/22/13 9:03 PM, Manu wrote:
> On 23 September 2013 12:28, Andrei Alexandrescu
> <SeeWebsiteForEmail at erdani.org>
> wrote:
> My design makes it very easy to experiment by allowing one to define
> complex allocators out of a few simple building blocks. It is not a
> general-purpose allocator, but it allows one to define any number of
> such.
>
> Oh okay, so this isn't really intended as a system then, so much as a
> suggested API?
For some definition of "system" and "API", yes :o).
> That makes almost all my questions redundant. I'm interested in the
> system, not the API of a single allocator (although your API looks fine
> to me).
> I already have allocators I use in my own code. Naturally, they don't
> inter-operate with anything, and that's what I thought std.allocator was
> meant to address.
Great. Do you have a couple of nontrivial allocators (heap, buddy system
etc) that could be adapted to the described API?
> The proposed design makes it easy to create allocator objects. How
> they are used and combined is left to the application.
>
> Is that the intended limit of std.allocator's responsibility, or will
> patterns come later?
Some higher level design will come later. I'm not sure whether or not
you'll find it satisfying, for reasons I'll expand on below.
> Leaving the usage up to the application means we've gained nothing.
> I already have more than enough allocators which I use throughout my
> code. The problem is that they don't inter-operate, and certainly not
> with foreign code/libraries.
> This is what I hoped std.allocator would address.
Again, if you already have many allocators, please let me know if you
can share some.
std.allocator will prescribe a standard for defining allocators, with
which the rest of std will work, same as std.range prescribes a standard
for defining ranges, with which std.algorithm, std.format, and other
modules work. Clearly one could come back with "but I already have my
own ranges that use first/done/next instead of front/empty/popFront, so
I'm not sure what we're gaining here".
> An allocator instance is a variable like any other. So you use the
> classic techniques (shared globals, thread-local globals, passing
> around as parameter) for using the same allocator object from
> multiple places.
>
>
> Okay, that's fine... but this sort of manual management implies that I'm
> using it explicitly. That's where it all falls down for me.
I think a disconnect here is that you think "it" where I think "them".
It's natural for an application to use one allocator that's not provided
by the standard library, and it's often the case that an application
defines and uses _several_ allocators for different parts of it. Then
the natural question arises, how to deal with these allocators, pass
them around, etc. etc.
> Eg, I want to use a library; its allocation patterns are incompatible
> with my application; I need to provide it with an allocator.
> What now? Is every library responsible for presenting the user with a
> mechanism for providing allocators? What if the author forgets? (a
> problem I've frequently had to chase up in the past when dealing with
> 3rd party libraries)
If the author forgets and hardcodes a library to use malloc(), I have no
way around that.
> Once a library is designed to expect a user to supply an allocator, what
> happens if the user doesn't? Fall-back logic/boilerplate exists in every
> library I guess...
The library wouldn't need to worry as there would be the notion of a
default allocator (probably backed by the existing GC).
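A rough sketch of how such a fallback could look from the library's side. It is written in C++ purely for illustration, with malloc standing in for the GC; IAllocator, MallocAllocator, CountingAllocator, defaultAllocator and makeZeroedInts are all hypothetical names, not part of the proposal:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical runtime interface for untyped allocators.
struct IAllocator {
    virtual void* allocate(std::size_t bytes) = 0;
    virtual void deallocate(void* p, std::size_t bytes) = 0;
    virtual ~IAllocator() = default;
};

// Stand-in for a GC-backed default; here it simply forwards to malloc/free.
struct MallocAllocator final : IAllocator {
    void* allocate(std::size_t bytes) override { return std::malloc(bytes); }
    void deallocate(void* p, std::size_t) override { std::free(p); }
};

// Process-wide default that libraries fall back to.
inline IAllocator& defaultAllocator() {
    static MallocAllocator instance;
    return instance;
}

// A user-supplied allocator that counts how often it is asked for memory.
struct CountingAllocator final : IAllocator {
    std::size_t calls = 0;
    void* allocate(std::size_t bytes) override {
        ++calls;
        return std::malloc(bytes);
    }
    void deallocate(void* p, std::size_t) override { std::free(p); }
};

// Library entry point: callers may pass an allocator, but need not.
inline int* makeZeroedInts(std::size_t count,
                           IAllocator& alloc = defaultAllocator()) {
    int* p = static_cast<int*>(alloc.allocate(count * sizeof(int)));
    if (p) {
        for (std::size_t i = 0; i < count; ++i) p[i] = 0;
    }
    return p;
}
```

A caller that does not care writes makeZeroedInts(4) and gets the default; a caller that does care passes its own allocator as the second argument. The library carries no fallback boilerplate of its own.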
> And does that mean that applications+libraries are required to ALWAYS
> allocate through given allocator objects?
Yes, they should.
> That effectively makes the new keyword redundant.
new will still be used to tap into the global shared GC. std.allocator
will provide other means of allocating memory.
> And what about the GC?
The current global GC is unaffected for the time being.
> I can't really consider std.allocator until it presents some usage patterns.
Then you'd need to wait a little bit.
> It wasn't clear to me from your demonstration, but 'collect()' implies
> that GC becomes allocator-aware; how does that work?
>
>
> No, each allocator has its own means of dealing with memory. One
> could define a tracing allocator independent of the global GC.
>
>
> I'm not sure what this means. Other than I gather that the GC and
> allocators are fundamentally separate?
Yes, they'd be distinct. Imagine an allocator that requests 4 MB from
the GC as NO_SCAN memory, and then does its own management inside that
block. User-level code allocates and frees e.g. strings or whatever from
that block, without the global GC intervening.
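To make the picture concrete, here is a minimal sketch of such a block-owning allocator. It is in C++ for illustration only: malloc stands in for the NO_SCAN request to the GC, the policy is the simplest one possible (bump a cursor, release everything at once), and Region with its member names is hypothetical rather than anything std.allocator defines:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>

// Hypothetical region allocator: grabs one block up front and hands out
// pieces of it by advancing an offset.
class Region {
public:
    explicit Region(std::size_t capacity)
        : begin_(static_cast<char*>(std::malloc(capacity))),
          used_(0),
          capacity_(begin_ ? capacity : 0) {}
    ~Region() { std::free(begin_); }
    Region(const Region&) = delete;
    Region& operator=(const Region&) = delete;

    // Returns nullptr when the block is exhausted. `align` must be a power
    // of two no larger than the alignment malloc guarantees.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        assert(align <= alignof(std::max_align_t));
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > capacity_ || size > capacity_ - start) return nullptr;
        used_ = start + size;
        return begin_ + start;
    }

    // Releases every allocation at once; the backing block is kept.
    void deallocateAll() { used_ = 0; }

    // True if p points into the part of the block handed out so far.
    bool owns(const void* p) const {
        const char* c = static_cast<const char*>(p);
        std::less<const char*> lt;
        return begin_ && !lt(c, begin_) && lt(c, begin_ + used_);
    }

    std::size_t bytesUsed() const { return used_; }

private:
    char* begin_;
    std::size_t used_;
    std::size_t capacity_;
};
```

Everything allocated from a Region lives inside its one block, so the backing store sees a single allocation no matter how many strings user code carves out of it.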
> Is it possible to create a tracing allocator without language support?
I think it is possible.
> Does the current language insert any runtime calls to support the GC?
Aside from operator new, I don't think so.
> I want a ref-counting GC for instance to replace the existing GC, but
> it's impossible to implement one of them nicely without support from the
> language, to insert implicit inc/dec ref calls all over the place, and
> to optimise away redundant inc/dec sequences.
Unfortunately that's a chimera I had to abandon, at least at this level.
The problem is that installing an allocator does not get to define what
a pointer is and what a reference is. These are notions hardwired into
the language, so the notion of turning a switch and replacing the global
GC with a reference counting scheme is impossible at the level of a
library API.
(As an aside, you still need tracing for collecting cycles in a
transparent reference counting scheme, so it's not all roses.)
What I do hope to get to is to have allocators define their own pointers
and reference types. User code that uses those will be guaranteed
certain allocation behaviors.
> I can easily define an allocator to use in my own code if it's entirely
> up to me how I use it, but that completely defeats the purpose of this
> exercise.
It doesn't. As long as the standard prescribes ONE specific API for
defining untyped allocators, if you define your own to satisfy that API,
then you'll be able to use your allocator with e.g. std.container, just
as defining your own range the way std.range requires allows you to
tap into std.algorithm.
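As an illustration of the idea (in C++ rather than D, and with an assumed two-function API that is not the one std.allocator will prescribe), generic code written once against the API works with any allocator that satisfies it. Mallocator, CountingAllocator, duplicate and release are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Assumed untyped allocator API for this sketch:
//   void* allocate(std::size_t bytes);            // nullptr on failure
//   void  deallocate(void* p, std::size_t bytes);

// One allocator that satisfies the API by forwarding to malloc/free.
struct Mallocator {
    void* allocate(std::size_t bytes) { return std::malloc(bytes); }
    void deallocate(void* p, std::size_t) { std::free(p); }
};

// A user-defined allocator with the same API plus its own bookkeeping.
struct CountingAllocator {
    std::size_t live = 0;  // bytes currently handed out
    void* allocate(std::size_t bytes) {
        void* p = std::malloc(bytes);
        if (p) live += bytes;
        return p;
    }
    void deallocate(void* p, std::size_t bytes) {
        if (p) {
            std::free(p);
            live -= bytes;
        }
    }
};

// "Library" code: knows only the API, not any concrete allocator.
template <class Allocator>
char* duplicate(Allocator& alloc, const char* s) {
    std::size_t n = std::strlen(s) + 1;
    char* copy = static_cast<char*>(alloc.allocate(n));
    if (copy) std::memcpy(copy, s, n);
    return copy;
}

template <class Allocator>
void release(Allocator& alloc, char* s) {
    if (s) alloc.deallocate(s, std::strlen(s) + 1);
}
```

duplicate and release never mention Mallocator or CountingAllocator, in the same way that an algorithm written against front/empty/popFront never mentions a concrete range type.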
> Until there are standard usage patterns, practices, conventions that
> ALL code follows, then we have nothing. I was hoping to hear your
> thoughts about those details.
> It's quite an additional burden of resources and management to manage
> the individual allocations with a range allocator above what is supposed
> to be a performance critical allocator to begin with.
>
>
> I don't understand this.
>
>
> It's irrelevant here.
> But fwiw, in relation to the prior point about block-freeing a range
> allocation;
What is a "range allocation"?
> there will be many *typed* allocations within these ranges,
> but a typical range allocator doesn't keep track of the allocations within.
Do you mean s/range/region/?
> This seems like a common problem that may or may not want to be
> addressed in std.allocator.
> If the answer is simply "your range allocator should keep track of the
> offsets of allocations, and their types", then fine. But that seems like
> boilerplate that could be automated, or maybe there is a
> different/separate system for such tracking?
If you meant region, then yes that's boilerplate that hopefully will be
reasonably automated by std.allocator. (What I discussed so far predates
that stage of the design.)
> C++'s design seems reasonable in some ways, but history has demonstrated
> that it's a total failure, which is almost never actually used (I've
> certainly never seen anyone use it).
>
>
> Agreed. I've seen some uses of it that quite fall within the notion
> of the proverbial exception that proves the rule.
>
>
> I think the main fail of C++'s design is that it mangles the type.
> I don't think a type should be defined by the way its memory is
> allocated, especially since that could change from application to
> application, or even call to call. For my money, that's the fundamental
> flaw in C++'s design.
This is not a flaw as much as an engineering choice with advantages and
disadvantages on the relative merits of which reasonable people may
disagree.
There are two _fundamental_ flaws of the C++ allocator design, in the
sense that they are very difficult to argue in favor of and relatively
easy to argue against:
1. Allocators are parameterized by type; instead, individual allocations
should be parameterized by type.
2. There is no appropriate handling for allocators with state.
The proposed std.allocator design deals with (2) with care, and will
deal with (1) when it gets to typed allocators.
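To illustrate both points together, here is a hedged sketch in C++. It is not the typed layer of std.allocator, which does not exist yet; TrackingAllocator, make, dispose and Point are hypothetical names. The allocator itself is untyped and carries per-instance state, and the type appears only at each allocation:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>

// Untyped allocator with state: it knows about bytes, not about types.
struct TrackingAllocator {
    std::size_t outstanding = 0;  // allocations not yet returned
    void* allocate(std::size_t bytes) {
        void* p = std::malloc(bytes);
        if (p) ++outstanding;
        return p;
    }
    void deallocate(void* p, std::size_t) {
        if (p) {
            std::free(p);
            --outstanding;
        }
    }
};

// The type parameter belongs to the allocation, not to the allocator.
template <class T, class Allocator, class... Args>
T* make(Allocator& alloc, Args&&... args) {
    static_assert(alignof(T) <= alignof(std::max_align_t),
                  "sketch relies on malloc's default alignment");
    void* p = alloc.allocate(sizeof(T));
    if (!p) return nullptr;
    try {
        return ::new (p) T(std::forward<Args>(args)...);
    } catch (...) {
        alloc.deallocate(p, sizeof(T));  // do not leak if the constructor throws
        throw;
    }
}

// Destroys the object, then returns its memory to the same allocator.
template <class T, class Allocator>
void dispose(Allocator& alloc, T* obj) {
    if (!obj) return;
    obj->~T();
    alloc.deallocate(obj, sizeof(T));
}

// Small example type used with make/dispose.
struct Point {
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};
```

A single TrackingAllocator object serves make<int> and make<Point> alike, and its counter is ordinary instance state passed by reference, with no rebinding and no assumption that all instances are interchangeable.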
> Well as an atom, as you say, it seems like a good first step.
> I can't see any obvious issues, although I don't think I quite
> understand the collect() function if it has no relation to the GC. What
> is its purpose?
At this point collect() is only implemented by the global GC. It is
possible I'll drop it from the final design. However, it's also possible
that collect() will be properly defined as "collect all objects
allocated within this particular allocator that are not referred to from
any objects also allocated within this allocator". I think that's a
useful definition.
> If the idea is that you might implement some sort of tracking heap which
> is able to perform a collect, how is that actually practical without
> language support?
Language support would be needed for things like scanning the stack and
the globals. But one can gainfully use a heap with semantics as
described just above, which requires no language support.
> I had imagined going into this that, like the range interface which the
> _language_ understands and interacts with, the allocator interface would
> be the same, ie, the language would understand this API and integrate it
> with 'new', and the GC... somehow.
The D language has no idea what a range is. The notion is completely
defined in std.range.
> If allocators are just an object like in C++ that people may or may not
> use, I don't think it'll succeed as a system. I reckon it needs deep
> language integration to be truly useful.
I guess that's to be seen.
> The key problem to solve is the friction between different libraries,
> and different moments within a single application itself.
> I feel almost like the 'current' allocator needs to be managed as some
> sort of state-machine. Passing them manually down the callstack is no
> good. And 'hard' binding objects to their allocators like C++ is no good
> either.
I think it's understood that if a library chooses its own ways to
allocate memory, there's no way around that. The point of std.allocator
is that it defines a common interface that user code can work with.
Andrei