Escape analysis (full scope analysis proposal)

Thu Oct 30 05:14:53 PDT 2008

On 2008-10-29 11:01:35 -0400, "Steven Schveighoffer" 
<schveiguy at yahoo.com> said:

> This is exactly the kind of thing I DON'T want to have.  Here, you have to
> specify everything, even though the compiler is also doing the work, and
> making sure it matches.  Tack on const modifiers, shared modifiers, and pure
> functions and there's going to be more decorations on function signatures
> than there are parameters.

I agree that this is becoming a problem, even without scope. What we 
need is good defaults so that you rarely have to decorate at all, and 
mostly only when you want to bypass the default.

I'd also like to point out that besides the possibility of better 
optimization and error catching by the compiler, specifying more 
properties in function interfaces can free us from handling other 
related things. With "immutable" values you don't need to worry about 
duplicating them everywhere to keep other references from changing 
them; with "shared", you'll have less to worry about regarding thread 
synchronization; and with "scope" as I proposed, you no longer have to 
worry about providing variables with the correct scope, as the compiler 
can dynamically allocate when it sees the variable is needed outside of 
the current scope.

Basically, by documenting interfaces better, in a machine-readable way, 
we are freed of other burdens that the compiler can now take care of. 
In addition, we have better-defined interfaces and the compiler has a 
lot more room to optimize things.

> Note that especially this scope stuff will be required more often than the
> others.

Indeed.

> I'd much rather have either no checks, or have the compiler (or a lint tool)
> do all the work to tell me if anything escapes.

The problem is that as soon as you have a function declaration without 
the body, the lint tool won't be able to tell you if it escapes or not. 
So, without a way to specify the requested scope of the parameters, 
you'll very often have holes in your escape analysis that will 
propagate down the caller chain, preventing any useful conclusion.

For instance:

	void foo()
	{
		char[5] x = ['1', '2', '3', '4', '\0'];
		bar(x.ptr);
	}

	void bar(char* x)
	{
		printf(x);
	}

	void printf(char* x);

Here you have no specification telling you that printf won't keep a 
reference to x beyond its scope, so we have to expect that it may do 
so. Because of that, a compiler or lint tool can't deduce whether bar 
may leak the reference beyond its scope, which basically means that 
calling bar on x in foo may or may not be safe. With my proposal, it'd 
become this:

	void foo()
	{
		char[5] x = ['1', '2', '3', '4', '\0'];
		bar(x.ptr);
	}

	void bar(scope char* x)
	{
		printf(x);
	}

	void printf(scope char* x);

And here the compiler, or the lint tool, can see that x doesn't need to 
live outside of foo's scope and that all is fine. If bar decided to 
keep the pointer in a global variable for further use, then its 
signature would have to declare x as noscope, otherwise the assignment 
to the global wouldn't compile. And once bar has a noscope argument, 
foo won't compile unless x is allocated on the heap.
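
To make that last case concrete, here is a rough sketch in the syntax 
I'm proposing. It is only an illustration: noscope is not an existing 
keyword, no current compiler accepts this, and the global "saved" and 
the function foo2 are names made up for the example.

	char* saved;	// hypothetical global, for illustration only

	void printf(scope char* x);

	void bar(noscope char* x)	// x may outlive the call
	{
		saved = x;	// allowed only because x is noscope
		printf(x);
	}

	void foo()
	{
		char[5] x = ['1', '2', '3', '4', '\0'];
		bar(x.ptr);	// rejected: x lives on foo's stack
	}

	void foo2()
	{
		char[] x = "1234\0".dup;	// copy allocated on the heap
		bar(x.ptr);	// accepted: the data can outlive foo2
	}

The point is that the error shows up at the call site in foo, where 
the stack variable is declared, and not somewhere inside bar or printf.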

I don't think it's bad to force interfaces to be well documented, and 
documented in a format that the compiler can understand to find errors 
like this.

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/