why allocators are not discussed here

Wed Jun 26 09:38:42 PDT 2013

On Wed, Jun 26, 2013 at 01:16:31AM +0200, Adam D. Ruppe wrote:
> On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
> >And maybe (b) can be implemented by making gc_alloc / gc_free
> >overridable function pointers? Then we can override their values
> >and use scope guards to revert them back to the values they were
> >before.
> 
> Yea, I was thinking this might be a way to go. You'd have a global
> (well, thread-local) allocator instance that can be set and reset
> through stack calls.
> 
> You'd want it to be RAII or delegate based, so the scope is clear.
> 
> with_allocator(my_alloc, {
>      do whatever here
> });
> 
> 
> or
> 
> {
>    ChangeAllocator!my_alloc dummy;
> 
>    do whatever here
> } // dummy's destructor ends the allocator scope
> 
> 
> I think the former is a bit nicer, since the dummy variable is a bit
> silly. We'd hope that delegate can be inlined.

Actually, D's frontend leaves something to be desired when it comes to
inlining delegates. It *is* done sometimes, but not as often as one may
like. For example, opApply generally doesn't inline its delegate, even
when it's just a thin wrapper around a foreach loop.

But yeah, I think the former has nicer syntax. Maybe we can help the
compiler with inlining by making the delegate a compile-time parameter?
But it forces a switch of parameter order, which is Not Nice (hurts
readability 'cos the allocator argument comes after the block instead of
before).
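
To make the compile-time-parameter idea concrete, here is a rough sketch.
Everything in it is made up for illustration: AllocFn, currentAlloc,
withAllocator and countingAlloc are hypothetical names, not existing
druntime or Phobos APIs, and plain malloc stands in for the real gc_alloc
hook. It only shows the shape: a thread-local function pointer, a scope
guard that reverts it, and the block passed as an alias parameter (which
is why the allocator ends up after the block).

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical allocator hook type (an assumption, not a druntime API).
alias AllocFn = void* function(size_t) nothrow @nogc;

void* defaultAlloc(size_t n) nothrow @nogc { return malloc(n); }

// Module-level variables are thread-local by default in D.
AllocFn currentAlloc = &defaultAlloc;

// The block is an alias (compile-time) parameter, so the compiler sees
// the concrete callee. Note the allocator now comes after the block.
void withAllocator(alias block)(AllocFn alloc)
{
    auto saved = currentAlloc;
    currentAlloc = alloc;
    scope (exit) currentAlloc = saved;   // revert even if block throws
    block();
}

int calls;   // number of allocations served by countingAlloc

void* countingAlloc(size_t n) nothrow @nogc
{
    ++calls;
    return malloc(n);
}

void main()
{
    withAllocator!(() {
        auto p = currentAlloc(16);       // goes through countingAlloc
        free(p);
    })(&countingAlloc);

    assert(calls == 1);
    assert(currentAlloc == &defaultAlloc);   // hook was restored
}
```

Whether the frontend actually inlines the alias-parameter version is
something one would have to check in the generated code; the sketch only
gives it the chance to.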

> But, the template still has a big advantage: you can change the
> type. And I think that is potentially enormously useful.

True. It can use different types for different allocators that do (or
don't) do cleanup at the end of the scope, depending on what the
allocator needs to do.
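
A sketch of what I mean, with the caveat that all the names here
(currentAlloc, Region, GcBacked, ChangeAllocator, enter, releaseAll) are
invented for the example. Since D structs cannot have a default
constructor, the guard is created through a static factory instead of a
bare declaration. The guard's type decides at compile time whether any
end-of-scope cleanup happens.

```d
// Hypothetical thread-local allocation hook (not a druntime API).
alias AllocFn = void* function(size_t);
AllocFn currentAlloc;   // null until a guard installs something

// Hypothetical region allocator: needs cleanup at the end of the scope.
struct Region
{
    static ubyte[256] pool;
    static size_t used;

    static void* alloc(size_t n)
    {
        assert(used + n <= pool.length, "region exhausted");
        void* p = pool.ptr + used;
        used += n;
        return p;
    }

    static void releaseAll() { used = 0; }
}

// Hypothetical GC-backed allocator: nothing to clean up at scope exit.
struct GcBacked
{
    static void* alloc(size_t n) { return (new ubyte[n]).ptr; }
}

// RAII guard; the allocator is part of the guard's type.
struct ChangeAllocator(A)
{
    private AllocFn saved;
    private bool active;

    @disable this(this);   // a guard must not be copied

    static ChangeAllocator enter()
    {
        ChangeAllocator g;
        g.saved = currentAlloc;
        g.active = true;
        currentAlloc = &A.alloc;
        return g;
    }

    ~this()
    {
        if (!active) return;
        currentAlloc = saved;
        // Only allocators that declare releaseAll get the cleanup call.
        static if (__traits(hasMember, A, "releaseAll"))
            A.releaseAll();
    }
}

void main()
{
    {
        auto guard = ChangeAllocator!Region.enter();
        auto p = currentAlloc(8);
        assert(p !is null);
        assert(Region.used == 8);
    }   // guard's destructor restores the hook and resets the region
    assert(Region.used == 0);
    assert(currentAlloc is null);

    {
        auto guard = ChangeAllocator!GcBacked.enter();
        assert(currentAlloc(4) !is null);
    }   // no cleanup call is generated for GcBacked
    assert(currentAlloc is null);
}
```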

> Another question is how to tie into output ranges. Take std.conv.to.
> 
> auto s = to!string(10); // currently, this hits the gc
> 
> What if I want it to go on a stack buffer? One option would be to
> rewrite it to use an output range, and then call it like:
> 
> char[20] buffer;
> // it returns the slice of the buffer it actually used
> auto s = to!string(10, buffer);
> 
> (and we can do overloads so to!string(10, radix) still works, as
> well as to!string(10, radix, buffer). Hassle, I know...)

I think supporting the multi-argument version of to!string() is a good
thing, but what to do with library code that calls to!string()? It'd be
nice if we could somehow redirect those GC calls without having to comb
through the entire Phobos codebase for stray calls to to!string().

[...]
> The fun part is the output range works for that, and could also work
> for something like this:
> 
> struct malloced_string {
>     char* ptr;
>     size_t length;
>     size_t capacity;
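>     void put(char c) {
>         if(length >= capacity) {
>            // grow geometrically; start at 16 so a zero capacity works
>            capacity = capacity ? capacity*2 : 16;
>            ptr = cast(char*) realloc(ptr, capacity);
>         }
>         ptr[length++] = c;
>     }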
> 
>     char[] slice() { return ptr[0 .. length]; }
>     alias slice this;
>     mixin RefCounted!this; // pretend this works
> }
> 
> 
> {
>    malloced_string str;
>    auto got = to!string(10, str);
> } // str is out of scope, so it gets free()'d. unsafe though: if you
> stored a copy of got somewhere, it is now a pointer to freed memory.
> I'd kinda like language support of some sort to help mitigate that
> though, like being a borrowed pointer that isn't allowed to be
> stored, but that's another discussion.

Nice!

> And that should work. So then what we might do is provide these
> little output range wrappers for various allocators, and use them on
> many functions.
> 
> So we'd write:
> 
> import std.allocators;
> import std.range;
> 
> // mallocator is provided in std.allocators and offers the goods
> OutputRange!(char, mallocator) str;
> 
> auto got = to!string(10, str);

I like this. However, it still doesn't address how to override the
default allocator in, say, Phobos functions.

> What's nice here is the output range is useful for more than just
> allocators. You could also to!string(10, my_file) or a delegate,
> blah blah blah. So it isn't too much of a burden, it is something
> you might naturally use anyway.

Now *that* is a very nice idea. I like having a way of bypassing using a
string buffer, and just writing the output directly to where it's
intended to go. I think to() with an output range parameter definitely
should be implemented. It doesn't address all of the issues, but it's a
very big first step IMO.
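
Until to() grows such an overload, std.format.formattedWrite already
takes an output range, so the idea can be tried out today. A small
sketch, where FixedSink is a hypothetical helper I made up for the
example (not something in Phobos) that writes into a caller-supplied
stack buffer:

```d
import std.format : formattedWrite;

// Hypothetical sink: an output range of char that fills a buffer owned
// by the caller, so no GC allocation is needed for the result.
struct FixedSink
{
    char[] buf;     // storage owned by the caller, e.g. a stack array
    size_t used;    // number of chars written so far

    void put(char c)
    {
        assert(used < buf.length, "buffer too small");
        buf[used++] = c;
    }

    void put(const(char)[] s)
    {
        foreach (c; s) put(c);
    }

    // The slice of the buffer that was actually used.
    char[] data() { return buf[0 .. used]; }
}

void main()
{
    char[20] storage;
    auto sink = FixedSink(storage[]);

    formattedWrite(sink, "%s", 10);   // formats into the stack buffer

    assert(sink.data == "10");
}
```

The same call works with any other output range, which is exactly the
point: a file, an appender, or a malloc-backed buffer would slot in
without touching the formatting code.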

> >Also, we may have the problem of the wrong allocator
> >being used to free the object.
> 
> Another reason why encoding the allocator into the type is so nice.
> For the minimal D I've been playing with, the idea I'm running with
> is all allocated memory has some kind of special type, and then
> naked pointers are always assumed to be borrowed, so you should
> never store or free them.

Interesting idea. So basically you can tell which allocator was used to
allocate an object just by looking at its type? That's not a bad idea,
actually.

> auto foo = HeapArray!char(capacity);
> 
> void bar(char[] lol){}
> 
> bar(foo); // allowed, foo has an alias this on slice

This is nice. Hooray for alias this. :)
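
For the record, here is roughly how I picture such a type. This is only
a sketch under my own assumptions: HeapArray here is a hypothetical
malloc-backed struct, and to keep it short it forbids copying instead of
doing the refcounting you describe. The function taking char[] only
borrows the memory.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical owning array: the type itself says "malloc'd, freed by me".
struct HeapArray(T)
{
    private T* ptr;
    private size_t len;

    this(size_t n)
    {
        ptr = cast(T*) malloc(n * T.sizeof);
        assert(ptr !is null || n == 0, "allocation failed");
        len = n;
    }

    @disable this(this);        // no copies, so exactly one owner frees

    ~this() { free(ptr); }      // free(null) is a no-op

    T[] slice() { return ptr[0 .. len]; }
    alias slice this;           // lets a HeapArray!T be passed as T[]
}

// Takes a plain slice: borrowed memory, not to be stored or freed.
size_t fillWith(char[] s, char c)
{
    s[] = c;
    return s.length;
}

void main()
{
    auto foo = HeapArray!char(4);
    assert(fillWith(foo, 'x') == 4);   // implicit conversion via alias this
    assert(foo.slice == "xxxx");
}   // foo's destructor frees the memory here
```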

> // but....
> 
> struct A {
>    // not allowed, because you don't know when lol is going to be freed
>    char[] lol;
> }
> 
> 
> foo frees itself with refcounting.

This is a bit inconvenient. So your member variables will have to know
what allocation type is being used. Not the end of the world, of course,
but not as pretty as one would like.

On Wed, Jun 26, 2013 at 03:24:57AM +0200, Adam D. Ruppe wrote:
> I was just quickly skimming some criticism of C++ allocators, since
> my thought here is similar to what they do. On one hand, maybe D can
> do it right by tweaking C++'s design rather than discarding it.
> 
> On the other hand, with all the C++ I've done, I have never actually
> used STL allocators, which could say something about me or could say
> something about them.
> 
> 
> One thing I saw said making the differently allocated object a
> different type sucks. ...but must it? The complaint there was "so
> much for just doing a function that takes a std::string". But, the
> way I'd want to do it in D is the function would take a char[]
> instead, and our special allocated type provides that via opSlice
> and/or alias this.

Yeah, I think alias this adds a whole new factor into the equation.
Having a distinct type makes this much easier to implement, and allows
you to mix differently-allocated objects without having to worry about
things like calling the right version of gc_free to clean up properly.
You can even have the same underlying data type be allocated in two
different ways, and the cleanup will happen correctly.

Basically, when you allocate some object O of class C using allocator A,
then it follows that no matter what you do with the gc_alloc/gc_free
function pointers afterwards, O must be freed using A.free. So in a
sense, O needs to carry around a function pointer to A.free in its dtor
(or whoever frees it). So this actually argues for having a distinct
type for an instance of C allocated using A, vs. an instance of C
allocated using a different allocator B. You need to store those function
pointers to A.free and B.free *somewhere*, otherwise things won't work
properly.
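
A sketch of the "store it in the type" option, under stated assumptions:
AllocA, AllocB and Owned are hypothetical names invented for this
example, both allocators simply wrap malloc, and release plays the role
of A.free / B.free (named differently only to avoid clashing with C's
free). Because the allocator is a template parameter, the "function
pointer" costs nothing at runtime; the destructor is just compiled
against the right release.

```d
import core.stdc.stdlib : malloc, free;

// Two hypothetical allocators with the same interface; each counts the
// number of times its release function is called.
struct AllocA
{
    static int frees;
    static void* alloc(size_t n) { return malloc(n); }
    static void release(void* p) { ++frees; free(p); }
}

struct AllocB
{
    static int frees;
    static void* alloc(size_t n) { return malloc(n); }
    static void release(void* p) { ++frees; free(p); }
}

// The allocator is part of the type, so the destructor always knows
// which release to call, whatever a global hook says at that moment.
struct Owned(T, Alloc)
{
    private T* ptr;

    @disable this(this);   // single owner

    static Owned make(T value)
    {
        Owned o;
        o.ptr = cast(T*) Alloc.alloc(T.sizeof);
        assert(o.ptr !is null, "allocation failed");
        *o.ptr = value;
        return o;
    }

    ~this() { if (ptr) Alloc.release(ptr); }

    ref T get() { return *ptr; }
}

void main()
{
    {
        auto x = Owned!(int, AllocA).make(1);
        auto y = Owned!(int, AllocB).make(2);
        assert(x.get + y.get == 3);
    }   // x and y are destroyed here, each through its own allocator

    assert(AllocA.frees == 1 && AllocB.frees == 1);
}
```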

[...]
> Anyway, bottom line is I don't think that criticism necessarily
> applies to D.

Agreed, in D, having a distinct type per allocator is, at the very least,
not as bad as it is in C++.

> But there's surely many others and I'm more or less a
> n00b re c++'s allocators so idk yet.

Who *isn't* a n00b wrt C++'s allocators, since so few people actually
use them? :-P

T

-- 
He who sacrifices functionality for ease of use, loses both and deserves
neither. -- Slashdotter