Escape Analysis on reddit

Sat Nov 1 10:01:21 PDT 2008

On 2008-11-01 12:02:47 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail at erdani.org> said:

> Michel Fortin wrote:
>> Basically, it's implemented like this:
>> 
>>     void swap(scope ref int a, scope ref int b)
>>     {
>>         int tmp = a;
>>         a = b; b = tmp;
>>     }
>> 
>> All fine, no reference is escaping. Now change the parameters for 
>> pointers or object references. You could naively implement it like this 
>> for pointers:
>> 
>>     void swap(scope ref int* a, scope ref int* b)
>>     {
>>         scope int* tmp = a;
>>         a = b; b = tmp;
>>     }
> 
> "scope" is not needed inside the function (in both cases). The compiler 
> will infer it.

Indeed.

>> But if swap compiles like this, it's dangerous because the scope for *a 
>> may be different from the scope for *b and you may leak a pointer to a 
>> narrower scope in a broader one.
> 
> Aha! Good point. But notice, however, that you are trying to extract 
> more juice from "scope" than there is in it. What scope says in swap is 
> only that swap itself does not escape any of its parameters. It doesn't 
> tell that the parameters have actually the same scope.

But my point was that the "no escape" guarantee doesn't hold if the two 
arguments' scopes don't match. One parameter may escape via the other 
parameter.
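
A minimal sketch of that escape, assuming plain D with no scope checking 
at all. The helper names leak and escaped are made up for illustration, 
and the "scope" annotations are left off because their semantics are 
exactly what is being debated. The local's address is only compared as 
an integer, never dereferenced.

```d
import std.stdio;

int global = 1;
int* pa;            // broader scope: outlives any call to leak()

// Same body as the pointer swap quoted above, minus the proposed
// "scope" annotations.
void swap(ref int* a, ref int* b)
{
    int* tmp = a;
    a = b;
    b = tmp;
}

// Swaps pa with a pointer to a local, then reports the local's address
// as an integer so the caller can compare it with pa.
size_t leak()
{
    int b = 2;                      // narrower scope (local)
    int* pb = &b;
    size_t addrOfLocal = cast(size_t) &b;
    swap(pa, pb);                   // pa now holds &b, pb holds &global
    return addrOfLocal;
}

// True when pa still holds the address of leak()'s dead local.
bool escaped()
{
    pa = &global;
    immutable local = leak();
    return cast(size_t) pa == local;
}

void main()
{
    writeln(escaped() ? "pa dangles: it holds the address of a dead local"
                      : "pa is still valid");
}
```

Neither parameter is stored anywhere by swap itself, yet after the call 
the caller's longer-lived variable holds the shorter-lived address.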

>> For instance:
>> 
>>     // broader scope (global)
>>     int a = 1;
>>     int* pa = &a;
>> 
>>     void test()
>>     {
>>         // narrower scope (local)
>>         int b = 2;
>>         int* pb = &b;
>>         swap(pa, pb); // should be an error
>>     }
>> 
>> Here, swap(pa, pb), if allowed, would attempt to put a pointer to the 
>> local variable b into the global variable pa. After the function 
>> returns, pointer pa becomes invalid. To avoid this, you need the 
>> signature for swap to tell you a more complex scope constraint. Using 
>> Robert Jacques's suggested notation, I think it should be like:
>> 
>>     void swap(scope ref int* a, scope ref int* b)
>>         if ((*a).scope <= b.scope && (*b).scope <= a.scope)
>>     {
>>         scope int* tmp = a;
>>         a = b; b = tmp;
>>     }
> 
> This would require each variable to carry its actual scope either at 
> compilation or at runtime. Either would be very hard to implement if at 
> all. But I'm glad there is a bit of exaggeration on the side of safety 
> :o).

I wouldn't call it exaggeration. Adding complexity to function 
signatures only to be half-safe isn't very appealing to me.

>> Basically, if you're giving b a pointer to the value pointed by a, the 
>> value pointed by a needs to be alive as long or longer than b if you 
>> want to avoid having b point to invalid memory. Same for a and the 
>> value pointed by b.
>> 
>> Robert's notation is good for expressing that in the function 
>> signature, but I don't like it because it places constraints on the 
>> signature, not on the type, making constraint propagation a lot more 
>> complicated.
>> 
>> So I've been thinking of this notation:
>> 
>>     void swap(scopeof(*b) ref int* a, scopeof(*a) ref int* b)
>>     {
>>         scopeof(b) int* tmp = a;
>>         a = b; b = tmp;
>>     }
> 
> That's nice, but it would reduce what swap can do. Oftentimes it is not 
> known whether values manipulated by a function are in the same scope or 
> not.

The calling function should know, and from there it should be possible 
to check against the constraint.

About reducing what swap can do, I'd like to see

>> That's enough for the swap function. Now let's take a look at a plain 
>> setter function, which, I'm pretty sure, is a common pattern.
>> 
>> Let's try this first:
>> 
>>     struct S
>>     {
>>         int *pi;
>>     }
>> 
>>     void setPI(scope ref S s, scope int* pi)
>>     {
>>         s.pi = pi;
>>     }
>> 
>> Simple enough you think? But we have the same problem: the scope of the 
>> value pointed by pi may be narrower than the one s is in:
> 
> I am glad there is awareness of the limitations of scope. There are 
> still things that are outside scope's charter, and the summary is - 
> scope tells something about the function declaring it, not about a 
> relationship between its parameters.

So basically, if you have two scope arguments and you assign some part 
of the first to the other, you risk bypassing the scope-checking system 
despite appearances to the contrary. I'm not sure it's a good idea to 
have a scope-checking system that you can only half trust.
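
A rough sketch of how that could bite with the setter, again assuming 
plain D with no scope checking. The names longLived, storeLocal and 
setterEscaped are hypothetical, and taking S by ref is my assumption so 
that the assignment is visible to the caller. The stored address is only 
compared as an integer, never dereferenced.

```d
import std.stdio;

struct S
{
    int* pi;
}

// Setter in the spirit of setPI above; it keeps no copy of either
// argument for itself, it only assigns one into the other.
void setPI(ref S s, int* pi)
{
    s.pi = pi;
}

S longLived;        // broader scope: outlives any call to storeLocal()

// Stores a pointer to a local into the long-lived struct and reports
// the local's address as an integer for later comparison.
size_t storeLocal()
{
    int local = 42;                 // narrower scope
    setPI(longLived, &local);
    return cast(size_t) &local;
}

// True when longLived.pi still holds the address of the dead local.
bool setterEscaped()
{
    immutable addr = storeLocal();
    return cast(size_t) longLived.pi == addr;
}

void main()
{
    writeln(setterEscaped() ? "longLived.pi dangles after storeLocal()"
                            : "longLived.pi is still valid");
}
```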

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/