Escape Analysis on reddit

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Fri Oct 31 20:51:45 PDT 2008


Steven Schveighoffer wrote:
> "Andrei Alexandrescu" wrote
>> Steven Schveighoffer wrote:
>>> "Walter Bright" wrote
>>>> http://www.reddit.com/r/programming/comments/7akzd/escape_analysis/
>>> The question is, in the example:
>>>
>>> int* bar(int* p) { return p; }
>>> int* foo()
>>> {
>>>     int x = 3;
>>>     return bar(&x);
>>> }
>>>
>>> where is the problem?  Is the problem in bar, or is the problem in foo?
>>>
>>> The real problem is in foo, because there's technically nothing wrong 
>>> with bar.
>> The problem is that foo does not know enough about bar to deem the call 
>> correct. Notice that if you consider local functions instead of ints, that 
>> gets you straight to the delegates case that is the crux of the matter.
> 
> No, foo doesn't know anything.  The compiler is the one who does not know. 
> The user does, so it looks to the user like the compiler has a bug (which I 
> would argue is correct).

Well obviously. When I said "foo doesn't know" I really meant "the 
compiler while compiling foo and enjoying its horizon". I'm at a 
continuous disadvantage because I don't have the time to give full 
detail in my posts so they come off as unclear.

>>> So what ends up happening in your scheme is this:
>>>
>>> int *bar(int *p) { return p;}
>>>
>>> void foo(scope int *p)
>>> {
>>>    int *x = bar(p); // no escape, but compile error
>>> }
>> Well I was talking to Walter how in compiling a function there are two 
>> scopes to track: that of parameters, and that of locals inside functions. 
>> The former includes the latter. Bar should be defined as:
>>
>> scope int* bar(scope int* p) { return p; }
>>
>> which is easy to check as long as some conservativeness is acceptable.
> 
> So how does foo know what scope return means?  Consider a min function:
> 
> scope T min(T)(scope T v1, scope T v2);
> 
> Now, let's try it out:
> 
> scope int *foo(scope int *p)
> {
>    int x;
>    return min(&x, p);
> }

That won't compile per my explanation: scope in a function parameter is 
different from scope in a local variable. You can't return a scoped 
local. So the code won't compile.

> Shouldn't this not compile?  If you say 'well, the compiler can take the 
> most conservative approach, and assume scope means the most inner scope.'
> 
> OK

In this case it's not even conservative, it's doing the correct thing. 
To illustrate the effects of a conservative approach, consider:

scope int *foo(scope int *p)
{
    int x;
    auto r = &x;
    r = p;
    return r;
}

This function won't compile if the compiler employs flow-insensitive 
escape analysis (see e.g. 
http://en.wikipedia.org/wiki/Data_flow_analysis) because r will be 
marked as escaping regardless of order of statements.

Walter plans to implement path- and flow-insensitive escape analysis. In 
fact he first wants to define scope to work on credit alone (trust the 
programmer) and later on implement the analysis. I hope I'll be able to 
talk him out of that, because it's a precisely backwards approach: he'll 
first allow vale tudo, and then break code when he implements a limited 
analysis. I think the right thing is to move from conservative to 
permissive.

> scope T[] findFirst(scope T[] buf, scope T target);
> 
> scope T[] findFirst5(scope T[] buf)
> {
>    scope t5 = new T(5);
>    return findFirst(buf, t5);
> }

Same thing, won't compile. This is a clear example where conservative 
doesn't cut it.

There is a paper that I don't have the time to look for, that does a 
pointer analysis and associate with each function a little graph showing 
how the return value depends on the inputs. (Does anyone know of it?) I 
don't think we can afford to implement anything near that level of 
sophistication.

> The problem is, unless you allow fully specifying dependencies, there is 
> going to be ways people call functions that you didn't think of, which are 
> legitimate designs, and just can't be done with your new regime.  The 
> ultimate result is either people cast around the system, don't use the 
> system, or end up allocating unecessarily on the heap to work around it.

It all depends on how often those cases are encountered.

>>> So yes, you are protected, but at the cost of not being able to write 
>>> code that is provably safe.  The end result is you end up removing scope 
>>> where it really should be, and then virally, it makes you remove scope 
>>> decorations everywhere else.  I predict that if this is implemented, 
>>> nobody will be able to use it.
>> There are many cases in which statically typed programs disallow valid 
>> constructs. Each case is a matter of nuance, but by and large types have 
>> turned out to be successful.
> 
> I would bet that either A) the proposed system would not provide correct 
> protection, or B) the system would require many design changes in a library 
> like Tango, which uses lots of stack variables and inner functions to achive 
> higher performance.  It just doesn't work with actual already successful 
> software programs.  It's too limited.

It would be helpful to provide some examples from Tango or other 
libraries to substantiate this.

>>> I'll reiterate my challenge from another thread:
>>>
>>> void f(ref int *a, int *b, int *c) { if(*b < *c) a = b;  else a = c;}
>>>
>>> struct S
>>> {
>>>    int *v;
>>> }
>>>
>>> int *f2(S* s) { return s.v;}
>>>
>>> void f3(ref int *a, ref int *b, ref int *c)
>>> {
>>>    int *tmp = a;
>>>    a = b; b = c; c = tmp;
>>> }
>>>
>>> void f4()
>>> {
>>>    int a=1, b=2, c=3;
>>>    auto ap = &a;
>>>    auto bp = &b;
>>>    auto cp = &c;
>>>    S s(cp);
>>>
>>>    f(ap, bp, cp);
>>>    b = *f2(&s);
>>>    f3(ap, bp, cp);
>>> }
>>>
>>> Please tell me how I should mark f, S, f2, and f3 such that f4 compiles. 
>>> Note that this code all works in D1 and provably has no problems with 
>>> escaping values, since f4 returns void, and no globals are accessed.
>> First off, we need to clarify that "provably" is a bit strong, and that 
>> the hypothesis "f4 returns void, and no globals are accessed" is not all 
>> that's needed for the proof. Full access to function bodies is necessary.
> 
> OK, if f4 returns void, and takes no arguments, and none of the code used by 
> any of the functions accesses globals (I admit, I may have said this 
> ambiguously last time), there's not even any heap activity, you tell me how 
> it can escape.  Prove me wrong.  You have full access to the source as you 
> can see.

That's exactly what I said. Without access to the bodies you can't tell 
anything. And I automatically discount access to bodies. Information 
should be inferred from function signatures.

>> Correctness may be provable to a human or machine having access to the 
>> entire body of code, but not if e.g. f, f2, and f3 are invisible to f4.
> 
> That is the challenge.  If I remove the function bodies, how do you mark the 
> signatures so you can still prove that it's correct?

Instead of looking on an artificial example, it would be better to work 
on a corpus of examples extracted from actual code.

>> Second, I contest the entire example. Static typing is supposed to prove 
>> interesting properties about *most* correct programs, at the cost of 
>> disallowing a *few* correct programs that are not very frequent, are not 
>> very interesting, or can be easily replaced with typesafe alternatives. 
>> One can write correct code based on void* that is virtually impossible to 
>> typecheck statically, but that doesn't show that static types are useless.
> 
> My example is at least as interesting as Walter's in the article to show the 
> problem.  If you want to complain about simple examples not being useful, 
> then stop using simple examples to show your point.  We are not talking 
> about mucking with void *'s or drastic violations of types here, we are 
> talking about reference parameters for returns, or an accessor of a member, 
> or a 3way swap function.  These are all common designs.  The fact that you 
> cannot find a solution so you resort to just bluntly calling my example 
> 'uninteresting' speaks for itself.

This is the second time my identity is conflated with that of Walter. I 
do not write his blog entries! Moreover, I also think they are due for 
an increase in quality of treatment. About two months ago I told Walter 
that his writing is overly imprecise and glossing over a lot of detail. 
To me it conveys the feeling that he considers the reader not too smart 
and not too critical and just willing to accept a lot of statements at 
face value. If I were tasked with writing on the same subjects, I'd make 
damn sure I could stand to lose a toe for every incorrect word in there. 
(Newsgroup standards aren't that high, hallelujah.) He rightly said in 
his defense that a blog entry can't be an in-depth treatment for the 
time he can afford to invest in it. My opinion of the whole banana is 
that if you want to make an authoritative blog that's also short and 
sweet you must end up pondering each word carefully and provide ample 
references for the interested.


Andrei



More information about the Digitalmars-d mailing list