Escape analysis (full scope analysis proposal)

Mon Nov 3 11:47:25 PST 2008

"Robert Jacques" wrote
> On Sat, 01 Nov 2008 12:00:10 -0400, Steven Schveighoffer 
> <schveiguy at yahoo.com> wrote:
>> "Michel Fortin" wrote
>>>> The only way I can see to solve this is to do it at link time.  When 
>>>> you
>>>> link, piece together the parts of the graph that were incomplete, and 
>>>> see
>>>> if
>>>> they all work.  It would be a very radical change, and might not even
>>>> work
>>>> with the current linkers.  Especially if you want to do shared 
>>>> libraries,
>>>> where the linker is builtin to the OS.
>>>
>>> I think you're dreaming... not that it's a bad thing to have ambition, 
>>> but
>>> that's probably not even possible.
>>
>> Sure it is ;)  You have to write a special linker.
>>
>> I think everyone who thinks a scope decoration proposal is going to 1) 
>> solve
>> all scope escape issues and 2) be easy to use is dreaming :P
>
> Various research languages have shown both 1 and 2 are possible.

I think 1 can be possibly done.  2 is a matter of subjectivity, and so far, 
I haven't seen an example of it.

But I also don't want D to become a purely academic language.  I want it to 
keep the system-level performance and usability that drew me to it in the 
first place.

>>>>> I don't think it's bad to force interfaces to be well documented, and
>>>>> documented in a format that the compiler can understand to find errors
>>>>> like this.
>>>>
>>>> I think this concept is going to be really hard for a person to 
>>>> decipher,
>>>> and really hard to get right.
>>>
>>> It takes some thinking to get the prototype right at first. But it takes
>>> less caution calling the function later with local variables since the
>>> compiler will either issue an error or automatically fix the issue by
>>> allocating on the heap when an argument requires a greater scope.
>>
>> I hope to avoid this last situation.  Having the compiler make decisions 
>> for
>> me, especially when heap allocation occurs, is bad.
>
> How so? Please explain why it's bad (an opinion by itself isn't and 
> argument).

Allocating on the heap involves locking a global mutex (as long as the heap 
is global), searching for a free memory space, possibly running a garbage 
collection cycle, and finally possibly allocating more memory from the OS.

All of these are very expensive compared to adjusting the stack pointer.

For instance, I wrote a 'chunk allocator' which uses D's allocator to 
allocate memory in chunks instead of going to the GC for each piece in 
dcollections' implementation.  Doing this achieved at least a 2x speedup 
because I was calling on the GC less often.  The author of Tango's new 
container implementation wrote a similar allocator that's even faster than 
that because it doesn't use the GC for any allocation (of course, you cannot 
use it to allocate items which have references, because the GC doesn't look 
at that memory).

In Tango, many operations rely on using stack allocation for buffers and 
temporary classes.  If the compiler decides I don't know what I'm doing and 
helpfully allocates those on the heap for my protection, I just lost all the 
performance that I purposely build the library to have.  This is one of the 
main arguments I hear from the other Tango devs about moving to D2, the 
automatic dynamic closure.

I think many people are not aware of how important it is to avoid heap 
allocation when possible.  It is one of the central goals that makes Tango 
so much faster than other libraries.

-Steve