Escape analysis (full scope analysis proposal)

Michel Fortin michel.fortin at michelf.com
Tue Nov 4 03:36:20 PST 2008


On 2008-11-03 14:47:25 -0500, "Steven Schveighoffer" 
<schveiguy at yahoo.com> said:

>>> I hope to avoid this last situation.  Having the compiler make decisions
>>> for me, especially when heap allocation occurs, is bad.
>> 
>> How so? Please explain why it's bad (an opinion by itself isn't an argument).
> 
> Allocating on the heap involves locking a global mutex (as long as the heap
> is global), searching for a free memory space, possibly running a garbage
> collection cycle, and finally possibly allocating more memory from the OS.
> 
> All of these are very expensive compared to adjusting the stack pointer.

I won't dispute this. I'll note that the upcoming "shared" keyword may 
help by avoiding the global mutex lock for unshared variables, but 
even without the mutex the operation is still expensive.


> For instance, I wrote a 'chunk allocator' which uses D's allocator to
> allocate memory in chunks instead of going to the GC for each piece in
> dcollections' implementation.  Doing this achieved at least a 2x speedup
> because I was calling on the GC less often.  The author of Tango's new
> container implementation wrote a similar allocator that's even faster than
> that because it doesn't use the GC for any allocation (of course, you cannot
> use it to allocate items which have references, because the GC doesn't look
> at that memory).

Nothing of the sort should be prevented by a scoping system. If it is, 
then I'd consider the system a failure.
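
To make the discussion concrete, here is a minimal sketch of the kind of 
chunk allocator described above. It is not the dcollections or Tango 
code; the name ChunkAllocator, its parameters, and the chunk size are 
hypothetical, and it assumes items are never handed back individually.

```d
// Hypothetical sketch, not the dcollections or Tango implementation.
// Carves items out of larger GC-allocated chunks so the GC is called
// once per `chunkSize` items instead of once per item.
struct ChunkAllocator(T, size_t chunkSize = 256)
{
    private T[] chunk;   // current chunk, obtained from the GC in one call
    private size_t used; // slots already handed out from `chunk`

    T* allocate()
    {
        if (used == chunk.length)
        {
            // Current chunk is exhausted (or none exists yet): get a new one.
            // Older chunks stay alive while pointers into them exist.
            chunk = new T[chunkSize];
            used = 0;
        }
        return &chunk[used++];
    }
}

void main()
{
    ChunkAllocator!(int, 4) alloc;

    int* a = alloc.allocate();
    int* b = alloc.allocate();
    *a = 1;
    *b = 2;

    // Both items come from the same chunk, so they are neighbours.
    assert(b == a + 1);
    assert(*a == 1 && *b == 2);

    // Requesting more than one chunk's worth still works.
    alloc.allocate();
    alloc.allocate();
    int* e = alloc.allocate(); // first slot of a second chunk
    *e = 5;
    assert(*e == 5 && *a == 1);
}
```

Nothing here depends on where the caller's own variables live, which is 
why a scoping system should have no reason to interfere with it.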


> In Tango, many operations rely on using stack allocation for buffers and
> temporary classes.  If the compiler decides I don't know what I'm doing and
> helpfully allocates those on the heap for my protection, I just lost all the
> performance that I purposely built the library to have.  This is one of the
> main arguments I hear from the other Tango devs about moving to D2, the
> automatic dynamic closure.

Then we must make sure the compiler doesn't heap allocate when it 
doesn't absolutely need to. And, *in addition*, when the programmer 
really needs to be sure that a variable is not heap-allocated, marking 
the variable "scope" would do the trick.
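
As a rough illustration of the two cases, here is a sketch using D's 
existing "scope" annotation on a delegate parameter rather than the 
fuller scoping system under discussion. The function names makeCounter 
and eachIndex are made up for the example, and the comments describe my 
understanding of how the compiler treats each case.

```d
// Hypothetical example; function names are invented for illustration.

// The delegate is returned, so it outlives this call. The compiler has
// to move `count` into a heap-allocated closure to keep it alive.
int delegate() makeCounter()
{
    int count = 0;
    return delegate int() { return ++count; };
}

// `scope` promises that `dg` is not kept after the call returns, so the
// caller's locals can stay on the stack and no closure is needed.
void eachIndex(int n, scope void delegate(int) dg)
{
    foreach (i; 0 .. n)
        dg(i);
}

void main()
{
    auto next = makeCounter();
    assert(next() == 1);
    assert(next() == 2);

    int sum = 0; // captured below, but only by a scope delegate
    eachIndex(4, (int i) { sum += i; });
    assert(sum == 6); // 0 + 1 + 2 + 3
}
```

The first case is the automatic allocation being objected to; the second 
is the programmer stating that no escape happens.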


> I think many people are not aware of how important it is to avoid heap
> allocation when possible.  It is one of the central goals that makes Tango
> so much faster than other libraries.

I agree with your first assertion (and am not familiar enough with 
Tango to say anything about the second), and this is exactly why I'm in 
favor of the compiler deciding what to heap-allocate. People are not 
aware enough of how important it is to avoid heap allocation, so I 
expect that if the compiler can be made to know about scopes, it can 
avoid heap allocation where many users wouldn't bother (especially in a 
garbage-collected language where you can heap-allocate without 
thinking). That would result in faster programs with fewer bugs, all 
without having to think about the technical details.

Note that I may be wrong about this, but there's no way to be sure 
without trying. Anyway, once we have a proper scoping system, it'll be 
easy to try both and decide between auto-allocation and simply enforcing 
constraints by emitting errors.


-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



