<div dir="ltr">On 23 September 2013 23:58, Andrei Alexandrescu <span dir="ltr">&lt;<a href="mailto:SeeWebsiteForEmail@erdani.org" target="_blank">SeeWebsiteForEmail@erdani.org</a>&gt;</span> wrote:<br><div class="gmail_extra">
<div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On 9/22/13 9:03 PM, Manu wrote:<br>
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
On 23 September 2013 12:28, Andrei Alexandrescu<br></div>
&lt;<a href="mailto:SeeWebsiteForEmail@erdani.org" target="_blank">SeeWebsiteForEmail@erdani.org</a>&gt;<div class="im">
<br>
wrote:<br>
My design makes it very easy to experiment by allowing one to define<br>
complex allocators out of a few simple building blocks. It is not a<br>
general-purpose allocator, but it allows one to define any number of<br>
such.<br>
<br></div><div class="im">
Oh okay, so this isn't really intended as a system then, so much as a<br>
suggested API?<br>
</div></blockquote>
<br>
For some definition of "system" and "API", yes :o).<div class="im"><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
That makes almost all my questions redundant. I'm interested in the<br>
system, not the API of a single allocator (although your API looks fine<br>
to me).<br>
I already have allocators I use in my own code. Naturally, they don't<br>
inter-operate with anything, and that's what I thought std.allocator was<br>
meant to address.<br>
</blockquote>
<br></div>
Great. Do you have a couple of nontrivial allocators (heap, buddy system etc) that could be adapted to the described API?<br></blockquote><div><br></div><div>Err, not really actually. When I use custom allocators, it's for performance, which basically implies that it IS a trivial allocator :)</div>
<div>The common ones I use are: stack-based mark&release, circular buffers, pools, pool groups (collection of different sized pools)... that might be it actually. Very simple tools for different purposes.</div><div><br>
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
The proposed design makes it easy to create allocator objects. How<br>
they are used and combined is left to the application.<br>
<br></div><div class="im">
Is that the intended limit of std.allocator's responsibility, or will<br>
patterns come later?<br>
</div></blockquote>
<br>
Some higher level design will come later. I'm not sure whether or not you'll find it satisfying, for reasons I'll expand on below.<div class="im"><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Leaving the usage up to the application means we've gained nothing.<br>
I already have more than enough allocators which I use throughout my<br>
code. The problem is that they don't inter-operate, and certainly not<br>
with foreign code/libraries.<br>
This is what I hoped std.allocator would address.<br>
</blockquote>
<br></div>
Again, if you already have many allocators, please let me know if you can share some.<br>
<br>
std.allocator will prescribe a standard for defining allocators, with which the rest of std will work, same as std.range prescribes a standard for defining ranges, with which std.algorithm, std.format, and other modules work. Clearly one could come back with "but I already have my own ranges that use first/done/next instead of front/empty/popFront, so I'm not sure what we're gaining here".<br>
</blockquote><div><br></div><div>No, it's just that I'm saying std.allocator needs to do a lot more than define a contract before I can start to consider if it solves my problems.</div><div>This is a good first step though, I'm happy to discuss this, but I think discussion about the practical application may also reveal design details at this level.</div>
<div><br></div><div>It's like you say: I can rename my allocator's methods to suit an agreed standard, and that'll take me 2 minutes. What matters is how the rest of the universe interacts with that API, and whether it effectively solves my problems.</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
An allocator instance is a variable like any other. So you use the<br>
classic techniques (shared globals, thread-local globals, passing<br>
around as parameter) for using the same allocator object from<br>
multiple places.<br>
<br>
<br></div><div class="im">
Okay, that's fine... but this sort of manual management implies that I'm<br>
using it explicitly. That's where it all falls down for me.<br>
</div></blockquote>
<br>
I think a disconnect here is that you think "it" where I think "them". It's natural for an application to use one allocator that's not provided by the standard library, and it's often the case that an application defines and uses _several_ allocators for different parts of it. Then the natural question arises, how to deal with these allocators, pass them around, etc. etc.</blockquote>
<div><br></div><div>No, I certainly understand you mean 'them', but you lead to what I'm asking, how do these things get carried/passed around. Are they discreet, or will they invade argument lists everywhere? Are they free to flow in/out of libraries in a natural way?</div>
<div>These patterns are what will define the system as I see it.</div><div>Perhaps more importantly, where do these allocators get their memory themselves (if they're not a bottom-level allocator)? Global override perhaps, or should a memory source always be explicitly provided to a non-bottom-level allocator?</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Eg, I want to use a library, its allocation patterns are incompatible<br>
with my application; I need to provide it with an allocator.<br>
What now? Is every library responsible for presenting the user with a<br>
mechanism for providing allocators? What if the author forgets? (a<br>
problem I've frequently had to chase up in the past when dealing with<br>
3rd party libraries)<br>
</blockquote>
<br></div>
If the author forgets and hardcodes a library to use malloc(), I have no way around that.</blockquote><div><br></div><div>Sure, but the common case is that the author will almost certainly use keyword 'new'. How can I affect that as a 3rd party?</div>
<div>This would require me overriding the global allocator somehow... which you touched on earlier.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Once a library is designed to expect a user to supply an allocator, what<br>
happens if the user doesn't? Fall-back logic/boilerplate exists in every<br>
library I guess...<br>
</blockquote>
<br></div>
The library wouldn't need to worry as there would be the notion of a default allocator (probably backed by the existing GC).</blockquote><div><br></div><div>Right. So it's looking like the ability to override the global allocator is a critical requirement.</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
And does that mean that applications+libraries are required to ALWAYS<br>
allocate through given allocator objects?<br>
</blockquote>
<br></div>
Yes, they should.</blockquote><div><br></div><div>Then we make keyword 'new' redundant?</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
That effectively makes the new keyword redundant.<br>
</blockquote>
<br></div>
new will still be used to tap into the global shared GC. std.allocator will provide other means of allocating memory.</blockquote><div><br></div><div>I think the system will fail here. People will use 'new' simply because it's a keyword. Once that's boxed in a library, I will no longer be able to affect that inconsiderate behaviour from my application.</div>
<div>Again, I think this signals that a global override is necessary.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
And what about the GC?<br>
</blockquote>
<br></div>
The current global GC is unaffected for the time being.<div class="im"><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I can't really consider std.allocator until it presents some usage patterns.<br>
</blockquote>
<br></div>
Then you'd need to wait a little bit.<br></blockquote><div><br></div><div>Okay.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
It wasn't clear to me from your demonstration, but 'collect()'<br>
implies<br>
that GC becomes allocator-aware; how does that work?<br>
<br>
<br>
No, each allocator has its own means of dealing with memory. One<br>
could define a tracing allocator independent of the global GC.<br>
<br>
<br></div><div class="im">
I'm not sure what this means. Other than I gather that the GC and<br>
allocators are fundamentally separate?<br>
</div></blockquote>
<br>
Yes, they'd be distinct. Imagine an allocator that requests 4 MB from the GC as NO_SCAN memory, and then does its own management inside that block. User-level code allocates and frees e.g. strings or whatever from that block, without the global GC intervening.</blockquote>
<div><br></div><div>Yup, that's fine. But what if the GC isn't the bottom level? There's just another allocator underneath.</div><div>What I'm saying is, the GC should *be* an allocator, not be a separate entity.</div>
<div><br></div><div>I want to eliminate the GC from my application. Ideally, in the future, it can be replaced with an ARC, which I have become convinced is the right choice for my work.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Is it possible to create a tracing allocator without language support?<br>
</blockquote>
<br></div>
I think it is possible.<div class="im"><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Does the current language insert any runtime calls to support the GC?<br>
</blockquote>
<br></div>
Aside from operator new, I don't think so.</blockquote><div><br></div><div>Okay, so a flexible lowering of 'new' is all we need for now?</div><div>It will certainly need substantially more language support for ARC.</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I want a ref-counting GC for instance to replace the existing GC, but<br>
it's impossible to implement one of them nicely without support from the<br>
language, to insert implicit inc/dec ref calls all over the place, and<br>
to optimise away redundant inc/dec sequences.<br>
</blockquote>
<br></div>
Unfortunately that's a chimera I had to abandon, at least at this level.</blockquote><div><br></div><div>And there's the part you said I'm not going to like? ;)</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
The problem is that installing an allocator does not get to define what a pointer is and what a reference is.</blockquote><div><br></div><div>Why not? A pointer has a type, like anything else. An ARC pointer can theoretically have the compiler insert ARC magic.</div>
<div>That does imply though that the allocator affects the type, which I don't like... I'll think on it.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
These are notions hardwired into the language, so the notion of turning a switch and replacing the global GC with a reference counting scheme is impossible at the level of a library API.<br></blockquote><div><br></div><div>
Indeed it is. So is this API being built upon an incomplete foundation? Is there something missing, and can it be added later, or will this design cement some details that might need changing in the future? (we all know potentially breaking changes like that will never actually happen)</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
(As an aside, you still need tracing for collecting cycles in a transparent reference counting scheme, so it's not all roses.)<br></blockquote><div><br></div><div>It's true, but it's possible to explicitly control all those factors. It remains deterministic.</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
What I do hope to get to is to have allocators define their own pointers and reference types. User code that uses those will be guaranteed certain allocation behaviors.</blockquote><div><br></div><div>Interesting, will this mangle the pointer type, or the object type being pointed to? The latter is obviously not desirable. Does the former actually work in theory?</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I can easily define an allocator to use in my own code if it's entirely<br>
up to me how I use it, but that completely defeats the purpose of this<br>
exercise.<br>
</blockquote>
<br></div>
It doesn't. As long as the standard prescribes ONE specific API for defining untyped allocators, if you define your own to satisfy that API, then you'll be able to use your allocator with e.g. std.container, just the same as defining your own range as std.range requires allows you to tap into std.algorithm.</blockquote>
<div><br></div><div>I realise this. That's all fine.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Until there are standard usage patterns, practices, and conventions that<br>
ALL code follows, then we have nothing. I was hoping to hear your<br>
thoughts about those details.<br>
</blockquote>
<br>
<br>
<br>
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
It's quite an additional burden of resources and management to<br>
manage<br>
the individual allocations with a range allocator above what is<br>
supposed<br>
to be a performance critical allocator to begin with.<br>
<br>
<br>
I don't understand this.<br>
<br>
<br></div><div class="im">
It's irrelevant here.<br>
But fwiw, in relation to the prior point about block-freeing a range<br>
allocation;<br>
</div></blockquote>
<br>
What is a "range allocation"?<div class="im"><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
there will be many *typed* allocations within these ranges,<br>
but a typical range allocator doesn't keep track of the allocations within.<br>
</blockquote>
<br></div>
Do you mean s/range/region/?</blockquote><div><br></div><div>Yes.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
This seems like a common problem that may or may not want to be<br>
addressed in std.allocator.<br>
If the answer is simply "your range allocator should keep track of the<br>
offsets of allocations, and their types", then fine. But that seems like<br>
boilerplate that could be automated, or maybe there is a<br>
different/separate system for such tracking?<br>
</blockquote>
<br></div>
If you meant region, then yes that's boilerplate that hopefully will be reasonably automated by std.allocator. (What I discussed so far predates that stage of the design.)<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
C++'s design seems reasonable in some ways, but history has<br>
demonstrated<br>
that it's a total failure, which is almost never actually used (I've<br>
certainly never seen anyone use it).<br>
<br>
<br>
Agreed. I've seen some uses of it that quite fall within the notion<br>
of the proverbial exception that proves the rule.<br>
<br>
<br></div><div class="im">
I think the main fail of C++'s design is that it mangles the type.<br>
I don't think a type should be defined by the way its memory is<br>
allocated, especially since that could change from application to<br>
application, or even call to call. For my money, that's the fundamental<br>
flaw in C++'s design.<br>
</div></blockquote>
<br>
This is not a flaw as much as an engineering choice with advantages and disadvantages on the relative merits of which reasonable people may disagree.<br>
<br>
There are two _fundamental_ flaws of the C++ allocator design, in the sense that they are very difficult to argue in favor of and relatively easy to argue against:<br>
<br>
1. Allocators are parameterized by type; instead, individual allocations should be parameterized by type.<br>
<br>
2. There is no appropriate handling for allocators with state.<br>
<br>
The proposed std.allocator design deals with (2) with care, and will deal with (1) when it gets to typed allocators.</blockquote><div><br></div><div>Fair enough. These are certainly more critical mistakes than the one I raised.</div>
<div>I'm trying to remember the details of the practical failures I ran into trying to use C++ allocators years ago.</div><div>Eventually, experience proved to us (myself and colleagues) that it wasn't worth the mess, and we simply pursued a more direct solution. I've heard similar stories from friends in other companies...</div>
<div>I need to try and recall the specific scenarios though, they might be interesting :/ .. (going back the better part of a decade >_<)</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Well as an atom, as you say, it seems like a good first step.<br>
I can't see any obvious issues, although I don't think I quite<br>
understand the collect() function if it has no relation to the GC. What<br>
is its purpose?<br>
</blockquote>
<br></div>
At this point collect() is only implemented by the global GC. It is possible I'll drop it from the final design. However, it's also possible that collect() will be properly defined as "collect all objects allocated within this particular allocator that are not referred from any objects also allocated within this allocator". I think that's a useful definition.</blockquote>
<div><br></div><div>Perhaps. I'm not sure how this situation arises though. Unless you've managed to implement your own GC inside an allocator.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
If the idea is that you might implement some sort of tracking heap which<br>
is able to perform a collect, how is that actually practical without<br>
language support?<br>
</blockquote>
<br></div>
Language support would be needed for things like scanning the stack and the globals. But one can gainfully use a heap with semantics as described just above, which requires no language support.<div class="im"><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I had imagined going into this that, like the range interface which the<br>
_language_ understands and interacts with, the allocator interface would<br>
be the same, ie, the language would understand this API and integrate it<br>
with 'new', and the GC... somehow.<br>
</blockquote>
<br></div>
The D language has no idea what a range is. The notion is completely defined in std.range.<div class="im"><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
If allocators are just an object like in C++ that people may or may not<br>
use, I don't think it'll succeed as a system. I reckon it needs deep<br>
language integration to be truly useful.<br>
</blockquote>
<br></div>
I guess that's to be seen.</blockquote><div><br></div><div>I think a critical detail to keep in mind is that (I suspect) people simply won't use it if it doesn't interface with the keyword 'new'.</div><div>
It also complicates generic code, and makes it more difficult to retrofit an allocator where 'new' is already in use.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
The key problem to solve is the friction between different libraries,<br>
and different moments within a single application itself.<br>
I feel almost like the 'current' allocator needs to be managed as some<br>
sort of state-machine. Passing them manually down the callstack is no<br>
good. And 'hard' binding objects to their allocators like C++ is no good<br>
either.<br>
</blockquote>
<br></div>
I think it's understood that if a library chooses its own ways to allocate memory, there's no way around that.</blockquote><div><br></div><div>Are we talking here about explicit choice for sourcing memory, or just that the library allocates through the default/GC?</div>
<div><br></div><div>This is the case where I like to distinguish a bottom-level allocator from a high-level allocator.</div><div>A library probably wants to use some patterns for allocating its objects; these are high-level allocators. But where it sources its memory from still needs to be overridable.</div>
<div>It's extremely common that I want to enforce that a library exist entirely within a designated heap. It can't fall back to the global GC.</div><div><br></div><div>I work on platforms where memory is not unified. Different resources need to go into different heaps.</div>
<div>It has happened on numerous occasions that we have been denied a useful library simply because the author did not provide allocation hooks, and the author was not responsive to requests... leading to my favourite scenario of re-inventing yet another wheel (the story of my career).</div>
<div>It shouldn't be the case that the author has to manually account for the possibility that someone might want to provide a heap for the library's resources.<br></div><div><br></div><div>This is equally true for the filesystem (a gripe I haven't really raised in D yet).</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> The point of std.allocator is that it defines a common interface that user code can work with.<span class="HOEnZb"><font color="#888888"><br>
<br>
<br>
Andrei<br>
<br>
</font></span></blockquote></div><br></div></div>