std.allocator needs your help

Manu turkeyman at gmail.com
Mon Sep 23 22:47:57 PDT 2013


On 24 September 2013 03:53, Andrei Alexandrescu <
SeeWebsiteForEmail at erdani.org> wrote:

> On 9/23/13 8:32 AM, Manu wrote:
>
>> On 23 September 2013 23:58, Andrei Alexandrescu
>> <SeeWebsiteForEmail at erdani.org> wrote:
>>
>> This is a good first step though, I'm happy to discuss this, but I think
>> discussion about the practical application may also reveal design
>> details at this level.
>>
>
> Absolutely. This is refreshing since I've gone through the cycle of "types
> must be integrated into allocators and with the global GC" ... -> ...
> "untyped allocators can actually be extricated" once. It is now great to
> revisit the assumptions again.
>
> One matter I'm rather surprised didn't come up (I assumed everyone would
> be quite curious about it) is that the allocator does not store the
> allocated size, or at least does not allow it to be discovered easily. This
> is definitely an appropriate topic for the design at hand.
>
> The malloc-style APIs go like:
>
> void* malloc(size_t size);
> void free(void*);
>
> Note that the user doesn't pass the size to free(), which means the
> allocator is burdened with inferring the size of the block from the pointer
> efficiently. Given that most allocators make crucial strategic choices
> depending on the size requested, this API shortcoming is a bane of
> allocator writers everywhere, and a source of inefficiency in virtually all
> contemporary malloc-style allocators.
>
> This is most ironic since there is next to nothing that user code can do
> with a pointer without knowing the extent of memory addressed by it. (A
> notable exception is zero-terminated strings, another questionable design
> decision.) It all harkens back to Walter's claim of "C's biggest mistake"
> (with which I agree) of not defining a type for a bounded memory region.
>
> Upon studying a few extant allocators and D's lore, I decided that in D
> things have evolved sufficiently to have the user pass the size upon
> deallocation as well:
>
> void[] allocate(size_t size);
> void deallocate(void[] buffer);
>
> This is because the size of D objects is naturally known: classes have it
> in the classinfo, slices store it, and the few cases of using bald pointers
> for allocation are irrelevant and unrecommended.
>
> This all makes memory allocation in D a problem fundamentally simpler than
> in C, which is quite an interesting turn :o).


Yeah, that's definitely cool.
I had previously just presumed the allocator would stash the size somewhere
along with its allocation record (as usual), but this does potentially
simplify some allocators.
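
To make the point concrete, here is a minimal sketch (not from the thread) of
how a sized deallocate could simplify an allocator. Only the
allocate/deallocate signatures come from the quoted proposal; the names
SizeClassAllocator, classOf, minShift and classCount are hypothetical, and
malloc/free merely stand in for a real backing store. Because the caller hands
back the slice, deallocate can pick the free list from buffer.length alone and
needs no per-block header.

```d
import core.stdc.stdlib : malloc, free;

/// Hypothetical allocator: one free list per power-of-two size class.
struct SizeClassAllocator
{
    enum minShift = 4;    // smallest block is 16 bytes
    enum classCount = 8;  // classes of 16, 32, ..., 2048 bytes

    private static struct Node { Node* next; }
    private Node*[classCount] freeLists;

    /// Size-class index for `size`, or classCount when the request is too large.
    static size_t classOf(size_t size)
    {
        size_t cls = 0;
        size_t cap = size_t(1) << minShift;
        while (cls < classCount && cap < size) { cap <<= 1; ++cls; }
        return cls;
    }

    void[] allocate(size_t size)
    {
        if (size == 0) return null;
        immutable cls = classOf(size);
        if (cls == classCount)
        {
            // Oversized request: go straight to the backing store.
            auto big = malloc(size);
            return big is null ? null : big[0 .. size];
        }
        if (auto n = freeLists[cls])
        {
            // Reuse a block that was previously returned to this class.
            freeLists[cls] = n.next;
            return (cast(void*) n)[0 .. size];
        }
        // Fresh block with the full capacity of its class.
        auto p = malloc(size_t(1) << (minShift + cls));
        return p is null ? null : p[0 .. size];
    }

    /// The caller must pass back the slice exactly as allocate returned it.
    void deallocate(void[] buffer)
    {
        if (buffer.ptr is null) return;
        immutable cls = classOf(buffer.length); // no header lookup needed
        if (cls == classCount) { free(buffer.ptr); return; }
        auto n = cast(Node*) buffer.ptr;
        n.next = freeLists[cls];
        freeLists[cls] = n;
    }
    // Note: this sketch never releases the cached blocks.
}
```

With a malloc-style free(void*), the same allocator would have to store the
class index next to every block or derive it from the address.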

>> No, I certainly understand you mean 'them', but you lead to what I'm
>> asking, how do these things get carried/passed around. Are they
>> discreet, or will they invade argument lists everywhere?
>>
>
> Since there's a notion of "the default" allocator, there will be ways to
> push/pop user-defined allocators that temporarily (or permanently) replace
> the default allocator. This stage of the design is concerned with allowing
> users to define such user-defined allocators without having to implement
> them from scratch.


I get that about this stage of design. But my question was about the next
stage, that is, usage/management of many instances of different allocators,
how they are carried around, and how they are supplied to things that want
to use them.
Prior comments led me to suspect you were in favour of exclusive usage of
this new API, and that the 'old' new keyword would continue to exist just to
allocate GC memory. At least that's the impression I got.
This leads to a presumption that parameter lists will become clogged with
allocators if they must be micro-managed, and that references to allocators
will end up all over the place. I'm not into that. I want to know how it
will look in practice.
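
For the sake of discussion, one hypothetical shape for the push/pop idea quoted
above is a thread-local stack of allocators that library code consults instead
of taking an allocator parameter. Nothing here was proposed in the thread: the
Allocator interface, MallocAllocator, CountingAllocator, pushAllocator,
popAllocator, defaultAllocator and libraryWork are all made-up names, and
malloc stands in for the GC-backed default that was suggested.

```d
import core.stdc.stdlib : malloc, free;

/// Hypothetical dynamic interface mirroring the untyped API from the thread.
interface Allocator
{
    void[] allocate(size_t size);
    void deallocate(void[] buffer);
}

/// Stand-in for "the default" allocator (assumption: the real one would be GC-backed).
final class MallocAllocator : Allocator
{
    void[] allocate(size_t size)
    {
        auto p = malloc(size);
        return p is null ? null : p[0 .. size];
    }
    void deallocate(void[] buffer) { free(buffer.ptr); }
}

/// Forwards to a parent and tracks live bytes, so the swap is observable.
final class CountingAllocator : Allocator
{
    private Allocator parent;
    size_t liveBytes;

    this(Allocator parent) { this.parent = parent; }

    void[] allocate(size_t size)
    {
        auto b = parent.allocate(size);
        liveBytes += b.length;
        return b;
    }
    void deallocate(void[] buffer)
    {
        liveBytes -= buffer.length;
        parent.deallocate(buffer);
    }
}

// Module-level variables are thread-local by default in D.
private Allocator[] allocatorStack;

/// Top of the stack; lazily installs the base allocator on first use.
Allocator defaultAllocator()
{
    if (allocatorStack.length == 0) allocatorStack ~= new MallocAllocator;
    return allocatorStack[$ - 1];
}

void pushAllocator(Allocator a)
{
    defaultAllocator(); // make sure the base entry exists
    allocatorStack ~= a;
}

void popAllocator()
{
    assert(allocatorStack.length > 1, "cannot pop the base allocator");
    allocatorStack = allocatorStack[0 .. $ - 1];
}

/// Library code asks for the current default; no allocator in its signature.
void[] libraryWork(size_t n) { return defaultAllocator().allocate(n); }
```

The sketch keeps allocators out of argument lists, but it does nothing about
the keyword 'new' or implicit allocations, which is the open question here.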

I appreciate you think that's off-topic, but I find it pretty relevant.

>> Sure, but the common case is that the author will almost certainly use
>> keyword 'new'. How can I affect that as a 3rd party?
>> This would require me overriding the global allocator somehow... which
>> you touched on earlier.
>>
>
> The way I'm thinking of this is to install/uninstall user-defined
> allocators that will satisfy calls to new. Since not all allocators support
> tracing, some would require the user to deallocate stuff manually.
>

So we need to keep delete then? I also think it may remain useful for
allocators that don't clean up automatically.

>>     The library wouldn't need to worry as there would be the notion of a
>>     default allocator (probably backed by the existing GC).
>>
>> Right. So it's looking like the ability to override the global
>> allocator is a critical requirement.
>>
>
> Again, this is difficult to handle in the same conversation as "should we
> pass size upon deallocation". Yes, we need road laws, but it's hard to talk
> about those and engine lubrication in the same breath. For me,
> "critical" right now is to assess whether the untyped API misses an
> important category of allocators, what safety level it has (that's why
> "expand" is so different from "realloc"!) etc.
>

Then we should fork the thread? Or I'll just wait for the next one?
I'm happy to do that, but as I said before, I suspect 'that' discussion
will have impact on this one though, so however awkward, I think it's
relevant.

>>         And does that mean that applications+libraries are required to
>>         ALWAYS
>>         allocate through given allocator objects?
>>
>>
>>     Yes, they should.
>>
>>
>> Then we make keyword 'new' redundant?
>>
>
> Probably not. Typed allocators will need to integrate with (and
> occasionally replace) the global shared GC.
>

'typed allocators'? Are they somehow fundamentally separate from this
discussion?
They seem like just a layer above which construct/destruct/casts. Do you
anticipate the typed allocation interface to be significantly more than
just a light wrapper?
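
To illustrate what I mean by a light wrapper, here is a rough sketch, under my
own assumptions, of a typed layer that only constructs, destructs and casts on
top of the untyped API. Mallocator, make, dispose and Point are hypothetical
names for this example and not part of the proposal; emplace comes from
std.conv and destroy from object.

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

/// Hypothetical untyped allocator with the signatures quoted in the thread.
struct Mallocator
{
    void[] allocate(size_t size)
    {
        auto p = malloc(size);
        return p is null ? null : p[0 .. size];
    }
    void deallocate(void[] buffer) { free(buffer.ptr); }
}

/// Typed layer: get raw bytes from the allocator, then construct a T in place.
T* make(T, A, Args...)(ref A alloc, auto ref Args args)
    if (is(T == struct))
{
    auto raw = alloc.allocate(T.sizeof);
    if (raw.ptr is null) return null;
    return emplace(cast(T*) raw.ptr, args);
}

/// Runs the destructor, then hands the sized slice back to the allocator.
void dispose(T, A)(ref A alloc, T* obj)
{
    if (obj is null) return;
    destroy(*obj);
    alloc.deallocate((cast(void*) obj)[0 .. T.sizeof]);
}

struct Point { int x, y; }
```

Usage would look like `Mallocator m; auto p = make!Point(m, 3, 4); dispose(m, p);`.
The size passed to deallocate comes from T.sizeof, so the typed layer never
needs the allocator to remember it.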

>>     The problem is that installing an allocator does not get to define
>>     what a pointer is and what a reference is.
>>
>> Why not?
>>
>
> Because that requires a language change. I'm not sure you realize but you
> move the goalposts all the time. We were talking within the context of
> libraries and installing allocators dynamically and all of a sudden you get
> to change what the compiler thinks a pointer is.
>

Deprecating new is definitely a language change.

You said (down lower) "have allocators define their own pointers and
reference types"... so _you_ brought up changing what the compiler thinks a
pointer is; that wasn't my idea. I am just trying to roll with that train
of thought, and consider possibilities.

I don't think it's fair to accuse me of moving goal posts when it's
absolutely not clear where they are to begin with. I made the first reply
in this thread, at which point there was no conversation to go from.
I haven't declared any goal posts I'm aware of. I'm just humoring ideas,
and trying to adapt my thinking to your responses, and consider how it
affects my requirements. I'm also trying to read into vague comments about
practical usage that I can't properly visualise without examples.
This is a very important topic to me long-term. It is the source of
innumerable headaches and messes throughout my career. D needs to get this
right.

I'm equally confused by your changing position on whether new is important
or should be deprecated, whether global 'new' overrides will be supported,
whether delete is assumed not to exist (which rules out non-collecting
allocators in conjunction with the keyword allocators), whether allocation
through an allocator object should be encouraged as standard in user code,
and whether typed allocators will need to integrate with (and occasionally
replace) the global GC.
Your position hasn't been static. I'm just trying to work out what it is,
and then consider if I'm happy with it.

You're implying I have a fixed position; I don't (though I do have a few
implementation requirements I think are important, however they end
up being expressed). I had no idea at all what you had in mind at the
start of the thread; I'm just trying to get clarification, and generally
throwing ideas in the pot.

>>     What I do hope to get to is to have allocators define their own
>>     pointers and reference types. User code that uses those will be
>>     guaranteed certain allocation behaviors.
>>
>>
>> Interesting, will this mangle the pointer type, or the object type being
>> pointed to? The latter is obviously not desirable. Does the former
>> actually work in theory?
>>
>
> I don't think I understand what you mean. Honest, it seems to me you're
> confused about what you want, how it can be done, and what moving pieces
> are involved.
>

Apparently I don't understand what you mean. What does "have allocators
define their own pointers and reference types" mean then?
I presumed you meant that an allocator would have some influence on the type
of the pointers it allocates, so that it can be known how to handle them
throughout their life.

> One example is Rust, which defines several pointer types to implement its
> scoping and borrowing semantics. I think it's not easy to achieve its
> purported semantics with fewer types, and time will tell whether
> programmers will put up with the complexity for the sake of the
> corresponding benefits.
>

I don't think Rust allows user-defined allocators to declare new pointer
types, does it?

>> I think a critical detail to keep in mind, is that (I suspect) people
>> simply won't use it if it doesn't interface with keyword 'new'.
>>
>
> I think this is debatable. For one, languages such as Java and C++ still
> have built-in "new" but quite ubiquitously unrecommend their usage in user
> code. Far as I can tell that's been a successful campaign.
>

I'm not sure what you mean... if by success you mean people don't use new
in C++, then yes, I have observed that pattern, and it's an absolute
catastrophe in my experience.
Every company/code base I've ever worked on has had some stupid new macro
they roll themselves. And that always leads to problems when interacting
with 3rd party libraries, and generic code in C++ can't work when there's
lots of different ways to allocate memory.

> For two, there are allocations that don't even entail calls to new, such as
> array concatenation.


This is a very important area of discussion; how will user-defined
allocators fit into implicit allocations?
Shall we split off a few threads so they can be considered 'on topic'?
There are a lot of things that need discussion.

>> It's extremely common that I want to enforce that a library exist
>> entirely within a designated heap. It can't fall back to the global GC.
>>
>> I work on platforms where memory is not unified. Different resources
>> need to go into different heaps.
>>
>
> The proposed API would make that entirely possible. You seem to care only
> with the appendix "and mind you I only want to use new for all of those!"
> which is a bit out there. But nevertheless good feedback.


You often "quote" people, but totally paraphrase them such that it's
nothing like what they said.
I'm sure I've made the point a whole bunch of times that I care only
that "libraries don't hard-code an allocator which can not be overridden by
the application that uses them".
That may sound like "I want to use new for all those", because libraries
invariably use new, and I can't control that. The reason for that is very
likely nothing more than 'because it's a keyword'; it's also convenient,
and perceived as the standard/proper way to allocate memory.
I think that's water well under the bridge, probably in the ocean by now,
new is out there, I don't think it can be revoked.
So as far as I see it, unless you can present a practical solution to
deprecate it, and a path to migrate existing 'new' calls, then new must be
a part of the solution here; it must be overridable.

That seems to be the direction this thread is going anyway, so I'll drop
this thread now. But you can't accuse me of changing my story when I'm only
trying to clarify your story in the first place, which also seems to have
changed (in favour of keeping new judging by the direction of the thread,
and raising DIP46).


More information about the Digitalmars-d mailing list