why allocators are not discussed here

Wed Jun 26 10:52:46 PDT 2013

On Wednesday, 26 June 2013 at 16:40:20 UTC, H. S. Teoh wrote:
> I think supporting the multi-argument version of to!string() is 
> a good thing, but what to do with library code that calls 
> to!string()? It'd be nice if we could somehow redirect those GC 
> calls without having to comb through the entire Phobos codebase 
> for stray calls to to!string().

Let's consider what kinds of allocations we have. We can break 
them up into two broad groups: internal and visible.

Internal allocations, in theory, don't matter. These can be on 
the stack, the gc heap, malloc/free, whatever. The function 
itself is responsible for their entire lifetime.

Changing these either optimizes, in the case of reusing a region, 
or leaks, if you switch it to manual allocation and the function 
doesn't know about it.

Visible allocations are important because the caller is 
responsible for freeing them. Here, I really think we want the 
type system's help: either it should return something that we 
know we're responsible for, or take a buffer/output range from us 
to receive the data in the first place.

Either way, the function signature should reflect what's going on 
with visible allocations. It'd possibly return a wrapped type and 
it'd take an output range/buffer/allocator.
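
To make the "take a buffer/output range from us" option concrete, here is a minimal sketch. The names writeJoined and FixedSink are made up for illustration and are not Phobos functions; the only library pieces used are put and isOutputRange from std.range.primitives and appender from std.array. The function never allocates on its own, so the caller decides where the bytes go.

```d
import std.array : appender;
import std.range.primitives : isOutputRange, put;

// Hypothetical helper: writes the parts, separated by sep, into whatever
// sink the caller hands in. The function itself makes no allocation.
void writeJoined(Sink)(ref Sink sink, const(string)[] parts, char sep)
    if (isOutputRange!(Sink, char))
{
    foreach (i, part; parts)
    {
        if (i) put(sink, sep);
        put(sink, part);
    }
}

// Hypothetical caller-owned sink backed by a fixed stack buffer.
struct FixedSink
{
    char[64] buf;
    size_t len;

    void put(char c) { buf[len++] = c; }

    void put(const(char)[] s)
    {
        buf[len .. len + s.length] = s[];
        len += s.length;
    }

    const(char)[] data() const { return buf[0 .. len]; }
}

unittest
{
    // Same function, two allocation strategies, both chosen by the caller.
    FixedSink onStack;
    writeJoined(onStack, ["a", "b", "c"], ',');
    assert(onStack.data == "a,b,c");

    auto onGcHeap = appender!string();
    writeJoined(onGcHeap, ["x", "y"], '-');
    assert(onGcHeap.data == "x-y");
}
```

The signature alone tells you the result lands in something you own, which is the whole point of making visible allocations show up in the type.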

With internals though, the only reason I can see why you'd want 
to change them outside the function is to give them a region of 
some sort to work with, especially since you don't know for sure 
what it is doing - these are all local variables to the 
function/call stack. And here, I don't think we want to change 
the allocator wholesale.

At most, we'd want to give it hints that what we're allocating is 
short-lived. (Or, better yet, have it figure this out on its own, 
like a generational gc.)

So I think this is more about tweaking the gc than replacing it, 
at most adding a couple new functions to it:

GC.hint_short_lived // returns a helper struct with a static refcount:

struct TempGcAllocator {
      static int tempCount = 0;
      static void* localRegion;
      this() { tempCount++; } // pretend this works
      ~this() {
            tempCount--;
            if(tempCount == 0)
                  gc.tryToCollect(localRegion);
      }

      T create(T, Args...)(Args args) {
            return GC.new_short_lived T(args);
      }
}

and gc.tryToCollect() does a quick scan for any references into 
the local region. If there are none, it frees the whole thing. If 
there are, in the name of memory safety, it just reintegrates that 
local region into the regular memory and gc's its components 
normally.

The reason the count is static is that you don't have to pass 
this thing down the call stack. Any function that wants to adapt 
to this generational hint system just calls hint_short_lived. If 
you're a leaf function, that's ok, the static count means you'll 
inherit the region from the function above you.
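
Here is a rough runnable sketch of just the static-count part. It is not the GC hook itself: a plain bump region stands in for the collector's local region, and there is no scan for escaped references, so it shows the nesting behavior and nothing about safety. ShortLivedScope, enter and create are names invented for this sketch. A factory function replaces the constructor because D structs can't declare a no-argument one.

```d
// Hypothetical stand-in for the hint helper. Static members are
// thread-local in D, so each thread gets its own count and region.
struct ShortLivedScope
{
    static int depth;        // how many hints are open on this call stack
    static ubyte[] region;   // bump region shared by all nested hints
    static size_t used;      // bytes handed out so far
    private bool active;

    @disable this(this);     // a copy would unbalance the count

    static ShortLivedScope enter()
    {
        if (depth++ == 0 && region is null)
            region = new ubyte[](4096);
        ShortLivedScope s;
        s.active = true;
        return s;
    }

    ~this()
    {
        if (!active) return;
        // Only the outermost hint releases the region. The real proposal
        // would scan for surviving references here instead.
        if (--depth == 0) used = 0;
    }

    // Restricted to plain-old-data so no destructor is ever skipped.
    static T* create(T)(T value) if (__traits(isPOD, T))
    {
        assert(depth > 0, "no short-lived scope is open");
        enum a = T.alignof;
        immutable start = (used + a - 1) & ~(a - 1);
        assert(start + T.sizeof <= region.length, "region exhausted");
        used = start + T.sizeof;
        auto p = cast(T*) (region.ptr + start);
        *p = value;
        return p;
    }
}

// A leaf function opts in without being passed anything.
int leaf()
{
    auto hint = ShortLivedScope.enter();
    auto p = ShortLivedScope.create(21);
    return *p * 2;
}

unittest
{
    {
        auto hint = ShortLivedScope.enter();
        auto a = ShortLivedScope.create(1);
        assert(leaf() == 42);
        // leaf's hint closed, but ours is still open: nothing released.
        assert(ShortLivedScope.depth == 1);
        assert(ShortLivedScope.used > 0);
    }
    assert(ShortLivedScope.depth == 0);
    assert(ShortLivedScope.used == 0);
}
```

Since the count is static, leaf() works the same whether or not its caller opened a hint first: called on its own, it is the outermost scope and the region is released when it returns.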

You would NOT use this in main(), as that defeats the purpose.

> I think to() with an output range parameter definitely
> should be implemented.

No doubt about it, we should aim for most phobos functions not to 
allocate at all, if given an output range they can use.

> Interesting idea. So basically you can tell which allocator was 
> used to allocate an object just by looking at its type?

Right, then you'll know if you have to free() it. (Or it can free 
itself with its destructor.)

> This is a bit inconvenient. So your member variables will have 
> to know what allocation type is being used. Not the end of the
> world, of course, but not as pretty as one would like.

Yeah, you'd need to know if you own them or not too (are you 
responsible for freeing that string you just got passed? If no, 
are you sure it won't be freed while you're still using it?), but 
I just think that's a part of memory management you can't 
sidestep.

There are two easy answers: 1) always make a private copy of 
anything you store (and perhaps write to) or 2) use a gc and 
trust it to always be the owner.

In any other case, I think you *have* to think about it, and the 
type telling you can help you make that decision.

> and allows you to mix differently-allocated objects without 
> having to

Important to remember though that you are borrowing these 
references, not taking ownership.

I think the rule that all pointers/slices are borrowed is fairly 
workable though. With the gc, that's ok, you don't own anything. 
The garbage collector is responsible for it all, so store away. 
(Though if it is mutable, you might want to idup it so you don't 
get overwritten by someone else. But that's a separate question 
from allocation method.... and already encoded in D's type 
system).
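
A small sketch of the "make a private copy before you store it" rule, using nothing but the built-in .dup and .idup array properties. NameRegistry is a made-up type for illustration.

```d
// Hypothetical container that stores names it was only lent.
struct NameRegistry
{
    string[] names;

    // The parameter is borrowed: we don't own it and the caller may
    // overwrite it later, so we keep an immutable private copy.
    void remember(const(char)[] borrowed)
    {
        names ~= borrowed.idup;
    }
}

unittest
{
    char[] scratch = "alice".dup;   // mutable buffer owned by the caller
    NameRegistry reg;
    reg.remember(scratch);

    scratch[0] = 'A';               // caller reuses its buffer
    assert(scratch == "Alice");
    assert(reg.names[0] == "alice"); // our copy is unaffected
}
```

If remember took a string instead, the type would already promise nobody can overwrite it, and with the gc as owner the copy could be skipped.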

So never free() a naked pointer unless you know what you're 
doing, like when interfacing with a C library. Prefer to only 
free a ManuallyAllocated!(pointer).

Hell, a C library binding could change the type too; it'd still be 
binary compatible. RefCounted!T wouldn't be, but 
ManuallyAllocated!T would just be a wrapper around T*.
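
For what it's worth, here is one way such a wrapper might look. This is only a sketch under my own assumptions: the layout of ManuallyAllocated and the helpers mallocOne, release and peek are all invented here; only malloc and free from core.stdc.stdlib are real library calls.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical wrapper: same size and layout as a bare T*, so a C
// binding could use it in a signature without changing the ABI.
struct ManuallyAllocated(T)
{
    T* ptr;
    alias ptr this;   // decays to a plain, borrowed T* when passed on
}

static assert(ManuallyAllocated!int.sizeof == (int*).sizeof);

// Allocates one T with malloc; the return type says "you must free this".
ManuallyAllocated!T mallocOne(T)(T value) if (__traits(isPOD, T))
{
    auto p = cast(T*) malloc(T.sizeof);
    assert(p !is null, "malloc failed");
    *p = value;
    return ManuallyAllocated!T(p);
}

// Only the wrapped type is accepted, so a naked pointer can't be freed
// through this function by accident.
void release(T)(ref ManuallyAllocated!T owned)
{
    free(owned.ptr);
    owned.ptr = null;
}

// An ordinary function that just borrows the pointer.
int peek(const(int)* borrowed) { return *borrowed; }

unittest
{
    auto n = mallocOne(42);
    assert(peek(n) == 42);    // implicit conversion through alias this
    release(n);
    assert(n.ptr is null);

    int local = 1;
    static assert(!__traits(compiles, release(&local)));
}
```

The static assert on the size is the binary-compatibility claim in code form; a RefCounted!T carries extra bookkeeping and couldn't pass it.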

I think I'm starting to ramble!