Eliminate class allocators and deallocators?

Wed Oct 7 15:04:03 PDT 2009

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s article
> dsimcha wrote:
> > == Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s article
> >> dsimcha wrote:
> >>> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s article
> >>>> It is a bad idea because distinguishing between release of (expensive)
> >>>> resources from dangerous memory recycling is the correct way to obtain
> >>>> deterministic resource management within the confines of safety.
> >>> This is based on two faulty assumptions:
> >>>
> >>> 1.  Memory is cheap.  (Not if you are working with absurd amounts of data).
> >>> 2.  Garbage collection is never a major bottleneck.  (Sometimes it's a
> >>> worthwhile tradeoff to add a few manual delete statements to code and
> >>> sacrifice some safety for making the GC run less often.)
> >> malloc.
> >> Andrei
> >
> > Kludge.  Requires using two separate heaps (inefficient) and worrying about
> > whether your stuff is manually freed on all code paths, not just the ones that are
> > executed often enough for performance to matter.
> Au contraire, once the GC heap becomes safe, I have less to worry about.
> Andrei

If you're that concerned about making the GC heap safe, here's a less destructive
(to other people's programming styles) way to do it:

1.  Make delete only call the d'tor and not release memory.  (I'm fine with this
provided the stuff below is done.)

2.  Add a standard library convenience function to core.memory that does what
delete does now (calls the d'tor AND frees the memory).  For the purposes of this
discussion, we'll call it deleteFree().  There's already a standard library
function that just frees memory, GC.free().  Keep it.

3.  If you really insist on absolute heap safety even at the expense of
performance, grep your code and get rid of all deleteFree() and GC.free() calls.
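To make the proposal concrete, here is a rough sketch of what deleteFree() could
look like.  It is only an illustration: deleteFree() is the hypothetical name from
step 2, the Tracked class exists purely for the demonstration, and destroy() stands
in for the "d'tor only" operation from step 1.  GC.free() is the existing
core.memory call.

```d
import core.memory : GC;

// Hypothetical helper: the name deleteFree() comes from the proposal above and
// is not an existing druntime function.
void deleteFree(T)(ref T obj)
    if (is(T == class))
{
    if (obj is null)
        return;
    destroy(obj);               // step 1: run the d'tor only
    GC.free(cast(void*) obj);   // step 2: hand the block back to the GC right away
    obj = null;                 // don't leave a dangling reference behind
}

// Illustrative class, used only to show that the d'tor runs exactly once.
class Tracked
{
    static int dtorCalls;
    ~this() { ++dtorCalls; }
}

unittest
{
    auto t = new Tracked;
    deleteFree(t);
    assert(Tracked.dtorCalls == 1);
    assert(t is null);
}
```

With something like this in place, step 3 really is a grep: every unsafe release
of GC memory goes through either deleteFree() or GC.free().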

Frankly, I consider the ability to manually free GC allocated memory to be a HUGE
asset for the following reasons, which I've mentioned before but would like to
distill:

1.  GC is usually the best way to program, but can be a huge bottleneck in some
corner cases.

2.  Maintaining two separate heaps (the manually memory managed C heap and the
GC'd D heap) is a massive and completely unacceptable kludge because:

a.  If you just want to delete a few objects to make the GC run less often, you
can just add delete statements for the common code paths, or paths where the end
of an object's lifetime is obvious.  You then let the GC handle the less common
code paths or cases where object lifetimes are non-trivial, and gain tons of
simplicity for only a small performance loss.  With a separate C heap you have to
handle all the odd code paths manually too, and this is when bugs really start to
seep in.

b.  Heaps have overhead.  Two heaps have twice the overhead.

c.  GC.addRoot(), etc. is a PITA *and* adds yet another place where you have to
lock on the GC mutex.  Half the need for manual memory management in D is because
the GC sometimes scales poorly to large numbers of threads.  This would definitely
not help the situation.

d.  Using the C heap whenever you want the ability to manually free something
doesn't play nicely with built-in language features such as classes, arrays,
associative arrays, etc., or objects returned from library functions.
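A side-by-side sketch may help show the difference.  Everything here is
illustrative: Node, sumOnGcHeap() and sumOnCHeap() are made-up names, and the C
heap version is just one plausible way to do it (malloc plus std.conv.emplace plus
GC.addRange), not the only one.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;
import std.conv : emplace;

// Illustrative class: it owns a GC-allocated array, as many D objects do.
class Node
{
    int[] payload;
    this(size_t n) { payload = new int[n]; payload[] = 1; }
    int sum() { int s = 0; foreach (x; payload) s += x; return s; }
}

// GC heap: free eagerly where the lifetime is obvious, do nothing elsewhere.
int sumOnGcHeap(size_t n, bool lifetimeIsObvious)
{
    auto node = new Node(n);
    immutable total = node.sum();
    if (lifetimeIsObvious)
    {
        destroy(node);               // run the d'tor
        GC.free(cast(void*) node);   // release the object now
    }
    // On the other path (and for payload on both paths) the collector cleans up.
    return total;
}

// C heap: every path must clean up, and the GC must be told about the block.
int sumOnCHeap(size_t n)
{
    enum size = __traits(classInstanceSize, Node);
    void* mem = malloc(size);
    if (mem is null)
        assert(0, "malloc failed");
    GC.addRange(mem, size);          // payload lives on the GC heap, so the block must be scanned
    scope (exit)
    {
        GC.removeRange(mem);         // a second point of synchronization with the GC
        free(mem);
    }
    auto node = emplace!Node(mem[0 .. size], n);   // plain 'new' can't be used here
    scope (exit) destroy(node);      // runs before the removeRange/free block above
    return node.sum();
}
```

Both functions return the same result.  The first can forget about cleanup on any
path it doesn't care about; the second cannot skip a single step on any path.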

Because of these 4 issues, I feel that only being allowed to do manual memory
management if you use the C heap is such an unacceptably bad kludge that it is for
many practical purposes akin to not being allowed to do manual memory management
at all.  This is unacceptable in a systems/performance language.

Remember, performance/systems languages can't place excessive emphasis on safety
and absolutely MUST assume the programmer knows what he/she is doing.  If you want
Java, you know where to find it.