why allocators are not discussed here

Wed Jun 26 13:10:44 PDT 2013

26-Jun-2013 21:23, H. S. Teoh wrote:

>> Both suffer from
>> a) being totally unsafe and in fact bug prone since all references
>> obtained in there are now dangling (and there is no indication where
>> they came from)
>
> How is this different from using malloc() and free() manually? You have
> no indication of where a void* came from either, and the danger of
> dangling references is very real, as any C/C++ coder knows. And I assume
> that *some* people will want to be defining custom allocators that wrap
> around malloc/free (e.g. the game engine guys who want total control).

Why the heck do you people think I propose using malloc directly as an
alternative to whatever hackish allocator stack is proposed?

Use the darn container. For starters I'd make the allocation strategy a
parameter of each container. At least they do OWN memory.

Then refactor out common pieces into a framework of allocation helpers.
In the end I'd personally separate concerns into 3 entities:

1. Memory area objects - think of them as allocators but without the
circuitry to do the allocation, e.g. a chunk of memory returned by
malloc/alloca can be wrapped into a memory area object.

2. Allocators (Policies) - a potentially nested combination of such 
"circuitry" that makes use of memory areas. Free-lists, pools, stacks 
etc. Safe ones have ref-counting on memory areas, unsafe ones don't.
(Though safety largely depends on the way you got that chunk of memory)

3. Containers/Wrappers - objects on top of the above that handle the
life-cycle of objects and make use of allocators. In fact allocators are
part of the type, but memory area objects are not.
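
To make the split concrete, here is a rough sketch of the three layers,
written in C++ only for illustration. Every name in it (MemoryArea,
StackPolicy, FixedArray, demo_sum) is made up for this example and is not
an existing library or a proposed API; the policy is a deliberately
trivial stack/bump scheme and the unsafe (non-ref-counted) flavor.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// 1. Memory area: just a chunk of bytes, with no allocation logic at all.
//    (Hypothetical type; the chunk could come from malloc, alloca, a static buffer...)
struct MemoryArea {
    unsigned char* base;
    std::size_t size;
};

// 2. Allocator (policy): the "circuitry" that carves pieces out of an area.
//    A simple stack/bump policy; it does not own the area (the unsafe kind).
//    Assumes `base` is aligned for std::max_align_t, so aligning offsets is enough.
class StackPolicy {
public:
    explicit StackPolicy(MemoryArea area) : area_(area), used_(0) {}

    void* allocate(std::size_t bytes, std::size_t align) {
        std::size_t start = (used_ + align - 1) / align * align;
        if (start > area_.size || bytes > area_.size - start) return nullptr;
        used_ = start + bytes;
        return area_.base + start;
    }

private:
    MemoryArea area_;
    std::size_t used_;
};

// 3. Container: handles the life-cycle of its elements. The policy is part
//    of the container's type; the memory area is not.
template <typename T, typename Policy>
class FixedArray {
public:
    FixedArray(Policy& policy, std::size_t capacity)
        : data_(static_cast<T*>(policy.allocate(capacity * sizeof(T), alignof(T)))),
          capacity_(data_ ? capacity : 0),
          length_(0) {}
    ~FixedArray() {
        for (std::size_t i = 0; i < length_; ++i) data_[i].~T();
    }
    FixedArray(const FixedArray&) = delete;
    FixedArray& operator=(const FixedArray&) = delete;

    // Returns false instead of growing once the capacity is exhausted.
    bool push(const T& value) {
        if (length_ == capacity_) return false;
        new (data_ + length_) T(value);
        ++length_;
        return true;
    }
    std::size_t length() const { return length_; }
    const T& operator[](std::size_t i) const { return data_[i]; }

private:
    T* data_;
    std::size_t capacity_;
    std::size_t length_;
};

// Usage sketch: wrap a stack buffer in an area, layer a policy on it,
// then hand the policy to a container.
inline int demo_sum() {
    alignas(std::max_align_t) unsigned char buffer[256];
    MemoryArea area{buffer, sizeof buffer};
    StackPolicy policy(area);
    FixedArray<int, StackPolicy> numbers(policy, 4);
    for (int i = 1; i <= 5; ++i) numbers.push(i * 10);  // the fifth push is refused
    int sum = 0;
    for (std::size_t i = 0; i < numbers.length(); ++i) sum += numbers[i];
    return sum;
}
```

Swapping the buffer for a malloc'ed chunk changes only the MemoryArea
line; swapping StackPolicy for another policy changes the container's
type, which is the point of item 3.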

>
>> b) imagine you need to use an allocator for a stateful object. Say
>> forward range of some other ranges (e.g. std.regex) both
>> scoped/stacked to allocate its internal stuff. 2nd one may handle it
>> but not the 1st one.
>
> Yeah this is a complicated area. A container basically needs to know how
> to allocate its elements. So somehow that information has to be
> somewhere.
>
>
>> c) transfer of objects allocated differently up the call graph
>> (scope graph?), is pretty much neglected I see.
>
> They're incompatible. You can't safely make a linked list that contains
> both GC-allocated nodes and malloc() nodes.

What I mean is that if the types are the same as built-ins it would be a
horrible mistake. If not, then we are talking about containers anyway.
And if these have a ref-counted pointer to their allocator then the
whole thing is safe, albeit at the cost of performance.

Sadly, alias this to some built-in (e.g. a slice) makes it too easy to
squirrel away the underlying reference.

As such I don't believe in any of the 2 *lies*:
a) built-ins can be refurbished to use custom allocators
b) we can add opSlice/alias this or whatever to our custom type to get 
access to the underlying built-ins safely and transparently

Both are just nuclear bombs waiting for a good time to explode.

> That's just a bomb waiting
> to explode in your face. So in that sense, Adam's idea of using a
> different type for differently-allocated objects makes sense.

Yes, but one should be careful here not to cause an exponential
explosion in code size. So some allocators have to be compatible, and if
there is a way to transfer ownership it'd be bonus points (and a large
pot of these, mind you).

> A
> container has to declare what kind of allocation its members are using;
> any other way is asking for trouble.

Hence my thoughts to move this piece of "circuitry" into containers
proper. The whole idea that by swapping malloc with myMalloc you can
switch to a wildly different allocation scheme doesn't quite hold.

I think it may be interesting to try and put the "wall" in a different
place, namely in between the allocation strategy and the memory areas it
works on.

>> I'm kind of wondering how our knowledgeable community has come to this.
>> (must have been starving w/o allocators way too long)
>
> We're just trying to provoke Andrei into responding. ;-)
>
>
Cool, then keep it coming but ... safety and other holes have to be
taken care of.

> [...]
>> IMHO the only place for allocators is in containers; other kinds of
>> code may just ignore allocators completely.
>
> But some people clamoring for allocators are doing so because they're
> bothered by Phobos using ~ for string concatenation, which implicitly
> uses the GC. I don't think we can just ignore that.

~= would work with any sensible array-like container.
~ is sadly only a convenience for scripts and/or apps that are not
performance (determinism) critical.
>
>
>> std.algorithm and friends should imho be customized on 2 things only:
>>
>> a) containers to use (instead of array)
>> b) optionally a memory source (or allocator) if container is
>> temporary(scoped) to tie its life-time to smth.
>>
>> Want temporary stuff? Use temporary arrays, hashmaps and whatnot
>> i.e. types tailored for a particular use case (e.g. with a
>> temporary/scoped allocator in mind).
>> These would all be unsafe though. Alternative is ref-counting
>> pointers to an allocator. With word on street about ARC it could be
>> nice direction to pursue.
>
> Ref-counting is not fool-proof, though. There's always cycles to mess
> things up.

Surely you shouldn't have allocators referencing each other cyclically?
Then I see this as a DAG with the allocator at the bottom and objects
referencing it.
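
A small sketch of that shape, again in C++ purely as an illustration:
objects hold a ref-counted pointer down to their allocator and nothing
points back up, so there is no cycle to leak. Arena, Buffer and the demo
functions are hypothetical names, and std::shared_ptr stands in for
whatever ref-counting scheme would actually be used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <utility>

// Hypothetical arena: owns one malloc'ed block and bumps a cursor through it.
class Arena {
public:
    explicit Arena(std::size_t size)
        : base_(static_cast<unsigned char*>(std::malloc(size))),
          size_(base_ ? size : 0),
          used_(0) {}
    ~Arena() { std::free(base_); }
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    void* allocate(std::size_t bytes) {
        const std::size_t a = alignof(std::max_align_t);
        if (bytes > size_) return nullptr;
        std::size_t rounded = (bytes + a - 1) / a * a;  // keep every block aligned
        if (rounded > size_ - used_) return nullptr;
        void* p = base_ + used_;
        used_ += rounded;
        return p;
    }

private:
    unsigned char* base_;
    std::size_t size_;
    std::size_t used_;
};

// An object living in the arena. It keeps the arena alive through a
// ref-counted pointer; edges only point "down", from object to allocator.
class Buffer {
public:
    Buffer(std::shared_ptr<Arena> arena, std::size_t count)
        : arena_(std::move(arena)),
          data_(static_cast<int*>(arena_->allocate(count * sizeof(int)))),
          count_(data_ ? count : 0) {
        for (std::size_t i = 0; i < count_; ++i) new (data_ + i) int(0);
    }
    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return count_; }

private:
    std::shared_ptr<Arena> arena_;  // released only when this object dies
    int* data_;
    std::size_t count_;
};

// The buffer outlives the scope that created the arena, yet nothing dangles.
inline int demo_outlives_scope() {
    std::unique_ptr<Buffer> survivor;
    {
        auto arena = std::make_shared<Arena>(1024);
        survivor.reset(new Buffer(arena, 8));
        (*survivor)[3] = 42;
    }  // local handle is gone; the arena lives on via survivor
    return (*survivor)[3];
}

// One local handle plus two buffers referencing the same arena.
inline long demo_owner_count() {
    auto arena = std::make_shared<Arena>(1024);
    Buffer a(arena, 4);
    Buffer b(arena, 4);
    return arena.use_count();
}
```

The price is the one mentioned earlier: every object carries an extra
ref-counted pointer and pays for the count updates.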

>
>
>> Allocators (as Andrei points out in his video) have many kinds:
>> a) persistence: infinite, manual, scoped
>> b) size: unlimited vs fixed
>> c) block-size: any, fixed, or *any* up to some maximum size
>>
>> Most of these ARE NOT interchangeable!
>> Yet some are composable however I'd argue that allocators are not
>> composable but have some reusable parts that in turn are composable.
>
> I was listening to Andrei's talk this morning, but I didn't quite
> understand what he means by composable allocators. Is he talking about
> nesting, say, a GC inside a region allocated by a region allocator?

I'd say something like: a fixed-size region allocator with GC as
fallback. Or a pool for small allocations + malloc/free with a free-list
for bigger allocations, etc. And the stuff should be as easily
composable as I just listed.
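
To show the first of those combinations, here is a minimal sketch in C++.
It is only an illustration under loud assumptions: C++ has no GC, so
malloc/free plays the role of the fallback, and Region, Mallocator,
FallbackAllocator and demo_fallback are names invented for this example,
not an existing API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>

// Fixed-size region: bump allocation out of an inline buffer.
template <std::size_t N>
class Region {
public:
    void* allocate(std::size_t bytes) {
        const std::size_t a = alignof(std::max_align_t);
        if (bytes > N) return nullptr;
        std::size_t rounded = (bytes + a - 1) / a * a;  // keep blocks aligned
        if (rounded > N - used_) return nullptr;
        void* p = buffer_ + used_;
        used_ += rounded;
        return p;
    }
    // std::less gives a well-defined ordering even for unrelated pointers.
    bool owns(const void* p) const {
        std::less<const unsigned char*> lt;
        auto* c = static_cast<const unsigned char*>(p);
        return !lt(c, buffer_) && lt(c, buffer_ + N);
    }
    void deallocate(void*) {}  // region memory is reclaimed all at once

private:
    alignas(std::max_align_t) unsigned char buffer_[N];
    std::size_t used_ = 0;
};

// Stand-in for the "GC as fallback" part: plain malloc/free.
struct Mallocator {
    void* allocate(std::size_t bytes) { return std::malloc(bytes); }
    void deallocate(void* p) { std::free(p); }
};

// The composition itself: try the primary, fall back when it refuses.
template <typename Primary, typename Fallback>
class FallbackAllocator {
public:
    void* allocate(std::size_t bytes) {
        if (void* p = primary_.allocate(bytes)) return p;
        return fallback_.allocate(bytes);
    }
    void deallocate(void* p) {
        if (primary_.owns(p)) primary_.deallocate(p);
        else fallback_.deallocate(p);
    }
    bool from_primary(const void* p) const { return primary_.owns(p); }

private:
    Primary primary_;
    Fallback fallback_;
};

// Usage sketch: a small request is served by the region, a large one falls through.
inline bool demo_fallback() {
    FallbackAllocator<Region<64>, Mallocator> alloc;
    void* small = alloc.allocate(32);   // fits in the 64-byte region
    void* large = alloc.allocate(128);  // too big for the region
    bool ok = small && large &&
              alloc.from_primary(small) && !alloc.from_primary(large);
    alloc.deallocate(large);
    alloc.deallocate(small);
    return ok;
}
```

The second combination (pool for small sizes, malloc plus a free-list
for big ones) would be the same kind of two-parameter wrapper, only
dispatching on the requested size instead of on failure.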

>> Code would have to cater for specific flavors of allocators still
>> so we'd better reduce this problem to the selection of containers.
> [...]
>
> Hmm. Sounds like we have two conflicting things going on here:
>
> 1) En masse replacement of gc_alloc/gc_free in a certain block of code
> (which may be the entire program), e.g., for the avoidance of GC in game
> engines, etc.. Basically, the code is allocator-agnostic, but at some
> higher level we want to control which allocator is being used.

There is no allocator-agnostic code that allocates. It either happens to
call free/dispose/destroy manually (or implicitly with ref-counts) or it
does not. It either escapes references to who knows where or it doesn't.

>
> 2) Specific customization of containers, etc., as to which allocator(s)
> should be used, with (hopefully) some kind of support from the type
> system to prevent mistakes like dangling pointers, escaping references,
> etc.. Here, the code is NOT allocator-agnostic; it has to be written
> with the specific allocation model in mind. You can't just replace the
> allocator with another one without introducing bugs or problems.

With another one of the same _kind_ I'd say.

> These two may interact in complex ways... e.g., you might want to use
> malloc to allocate a pool, then use a custom gc_alloc/gc_free to
> allocate from this pool in order to support language built-ins like ~
> and ~= without needing to rewrite every function that uses strings.

I guess we have to rewrite them. Or not allocate in string functions.

> Maybe we should stop conflating these two things so that we stop
> confusing ourselves, and hopefully it will be easier to analyse
> afterwards.
>

-- 
Dmitry Olshansky