Idea for allocators
Diggory
diggsey at googlemail.com
Fri May 31 11:52:02 PDT 2013
So, I've been thinking about a few of the current problems with D:
- No allocators on containers
- Standard library functions doing too much GC allocation
- Escaping pointers to memory not allocated using the GC
- Implicit allocation with "~", "~=" and array literals
And I came up with something that might be able to solve a few of
these:
string Test(Alloc = allocator(return))(string a, string b) {
    return a ~ b;
}
Escape analysis would be done by the compiler for every
allocation, whether that's implicit via "~", explicit with "new"
or whatever.
This will result in a list of ways that a reference to the
allocated memory can escape, which can contain:
- By assignment to a global
- By return value
- By a particular parameter (if parameter is ref or contains a
pointer)
- By the "this" parameter
If there are multiple ways it could escape then a partial
ordering can help the compiler choose the most general one, or if
there is no reasonable ordering then it could report an error.
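To make the "partial ordering" idea a bit more concrete, here is a rough sketch in plain D. The ordering itself is my assumption (the idea above doesn't fix one): "global" covers everything, "none" is covered by everything, and return/parameter/this are not comparable with each other. The names Escape, lessOrEqual and mostGeneral are made up for illustration.

```d
import std.stdio : writeln;
import std.typecons : Nullable;

// Escape routes from the list above; `none` is added here for
// allocations that never escape.
enum Escape { none, byReturn, byParam, byThis, global }

// Assumed ordering: none <= x <= global for every x.
// byReturn, byParam and byThis are not comparable with each other.
bool lessOrEqual(Escape a, Escape b)
{
    return a == b || a == Escape.none || b == Escape.global;
}

// Returns the route that covers all the others, or null if there is none.
Nullable!Escape mostGeneral(const Escape[] routes)
{
    Nullable!Escape result;
    if (routes.length == 0)
    {
        result = Escape.none;
        return result;
    }
    foreach (candidate; routes)
    {
        bool coversAll = true;
        foreach (r; routes)
            if (!lessOrEqual(r, candidate))
                coversAll = false;
        if (coversAll)
        {
            result = candidate;
            return result;
        }
    }
    return result; // still null: this is where the compiler would error
}

void main()
{
    // A global escape covers a return escape.
    assert(mostGeneral([Escape.byReturn, Escape.global]).get == Escape.global);
    // Return and "this" are unordered here, so there is no answer.
    assert(mostGeneral([Escape.byReturn, Escape.byThis]).isNull);
    writeln("ok");
}
```

A real implementation might well choose a different ordering; the point is only that "pick the most general, else error" is cheap to compute once the escape routes are known.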
In each case the allocator can be specified using a template
parameter:
- string Test(Alloc = allocator(global))(string a, string b);
- string Test(Alloc = allocator(return))(string a, string b);
- string Test(Alloc = allocator("a"))(string a, string b);
- string Test(Alloc = allocator(this))(string a, string b);
Multiple values could also be specified:
- string Test(Alloc = allocator("b", return))(string a, string b);
This does two things - it tells the caller what the allocator
will be used for, and it helps the compiler decide which
allocator to use for each allocation. If an allocation cannot
escape at all then it should be allocated on the stack, or at
least using a stack/region allocator for best performance.
Anyway, going back to this case:
string Test(Alloc = allocator(return))(string a, string b) {
    return a ~ b;
}
The compiler can see that the allocation caused by "a ~ b" can
only escape via the return value, so it will automatically
use the type "Alloc" as the allocator for that allocation.
If "Test" is called like so:
void Test2() {
    auto result = Test("Hello ", "world!");
    if (result.length > 5)
        writeln("Blah");
}
The "Alloc" parameter has a default value of "allocator(...)"
which means that the caller should try to figure out what to pass
in. "allocator(return)" means it will be used to allocate the
return value, so the compiler performs escape analysis on the
return value and finds out that it never escapes, and so provides
a simple stack/region allocator.
The Alloc parameter is a normal template parameter aside from its
default value, so you can always explicitly specify a different
allocator to use (this saves having GC and no-GC versions of each
Phobos function). It can also be used directly as an allocator
from within the function.
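For comparison, this is roughly what you would have to write by hand in today's D to get the same effect, without any "allocator" keyword or compiler help. It's only a sketch: GCAlloc, Region and concat are names I made up, the allocator is passed as an ordinary ref argument rather than chosen by the compiler, and the result is char[] rather than string because region memory isn't immutable.

```d
import std.stdio : writeln;

// Hypothetical allocator that just forwards to the GC.
struct GCAlloc
{
    void[] allocate(size_t n) { return new ubyte[n]; }
}

// Hypothetical fixed-size region living in the caller's stack frame.
struct Region(size_t N)
{
    ubyte[N] buf;
    size_t used;

    void[] allocate(size_t n)
    {
        if (used + n > N)
            return null; // out of space
        auto mem = buf[used .. used + n];
        used += n;
        return mem;
    }
}

// Hand-written stand-in for Test: the "a ~ b" allocation goes through alloc.
char[] concat(Alloc)(ref Alloc alloc, const(char)[] a, const(char)[] b)
{
    auto mem = cast(char[]) alloc.allocate(a.length + b.length);
    assert(mem.length == a.length + b.length, "allocation failed");
    mem[0 .. a.length] = a[];
    mem[a.length .. $] = b[];
    return mem;
}

void main()
{
    // The result never leaves this function, so a stack region is enough.
    Region!64 scratch;
    auto result = concat(scratch, "Hello ", "world!");
    if (result.length > 5)
        writeln("Blah");

    // Same function, GC-backed result that is safe to keep around.
    GCAlloc gc;
    auto kept = concat(gc, "Hello ", "world!");
    writeln(kept);
}
```

With the proposal, the caller would not have to pick between scratch and gc by hand; the compiler's escape analysis on the return value would make that choice.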
It could also be used with non-function templates such as
containers, although the only useful defaults would be
"allocator(this)" and "allocator(global)". The allocator would
still be filled in automatically by the compiler if not specified,
so that it could potentially allocate an entire container in a
stack/region allocator, all transparently to the caller.
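The container case can be sketched the same way in current D. Again this is just an illustration under my own assumptions: Builder and Region are invented names, the allocator is handed over as a pointer instead of being filled in by the compiler, and old blocks are simply abandoned on growth, which is fine for a region but not for a general allocator.

```d
import std.stdio : writeln;

// Hypothetical fixed-size region, as a stand-in for a stack/region allocator.
struct Region(size_t N)
{
    ubyte[N] buf;
    size_t used;

    void[] allocate(size_t n)
    {
        if (used + n > N)
            return null; // out of space
        auto mem = buf[used .. used + n];
        used += n;
        return mem;
    }
}

// Hypothetical container: all of its storage comes from the given allocator,
// which is what "allocator(this)" would mean for a container.
struct Builder(Alloc)
{
    Alloc* alloc; // supplied by whoever owns the container
    char[] data;  // backing storage obtained from alloc
    size_t len;   // number of characters in use

    void put(const(char)[] s)
    {
        if (len + s.length > data.length)
        {
            // Grow by allocating a larger block and copying the old contents.
            auto cap = (len + s.length) * 2;
            auto bigger = cast(char[]) alloc.allocate(cap);
            assert(bigger.length == cap, "allocator exhausted");
            bigger[0 .. len] = data[0 .. len];
            data = bigger;
        }
        data[len .. len + s.length] = s[];
        len += s.length;
    }

    const(char)[] text() const { return data[0 .. len]; }
}

void main()
{
    // The whole container lives in a region in this stack frame.
    Region!128 scratch;
    auto b = Builder!(Region!128)(&scratch);
    b.put("Hello ");
    b.put("world!");
    writeln(b.text);
}
```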
In cases where there is no allocator, such as in a non-template,
it will fall back to using the GC and so be completely backward
compatible.
This could all be quite difficult to implement but it does
provide some nice benefits:
- In most cases the only thing needed to take advantage is to add
"Alloc = allocator(return)" to the template parameters
- Should massively reduce GC usage and cost of allocations (think
toLower, etc.)
- No new syntax apart from the keyword "allocator"
- Can still use "~" and all the other nice features of D even in
performance-critical/no-GC code
- Compiler analysis required is confined to a single function at
a time
- The biggest problem with allocators in C++ is that nobody
actually bothers to use them. Since in this case the best
allocator is chosen automatically that's not a problem.