Escape Analysis on reddit

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Sat Nov 1 09:02:47 PDT 2008


Michel Fortin wrote:
> On 2008-10-31 23:51:45 -0400, Andrei Alexandrescu 
> <SeeWebsiteForEmail at erdani.org> said:
> 
>>> The problem is, unless you allow fully specifying dependencies, there 
>>> are going to be ways people call functions that you didn't think of, 
>>> which are legitimate designs, and just can't be done with your new 
>>> regime.  The ultimate result is either people cast around the system, 
>>> don't use the system, or end up allocating unnecessarily on the heap 
>>> to work around it.
>>
>> It all depends on how often those cases are encountered.
> 
> Tell me how often you encounter a function that swaps two variables, 
> then.
> 
> Basically, it's implemented like this:
> 
>     void swap(scope ref int a, scope ref int b)
>     {
>         int tmp = a;
>         a = b; b = tmp;
>     }
> 
> All fine, no reference is escaping. Now change the parameters to 
> pointers or object references. You could naively implement it like this 
> for pointers:
> 
>     void swap(scope ref int* a, scope ref int* b)
>     {
>         scope int* tmp = a;
>         a = b; b = tmp;
>     }

"scope" is not needed inside the function (in both cases). The compiler 
will infer it.

> But if swap compiles like this, it's dangerous because the scope for *a 
> may be different from the scope for *b and you may leak a pointer to a 
> narrower scope into a broader one.

Aha! Good point. Notice, however, that you are trying to extract more 
juice from "scope" than there is in it. What scope says in swap is only 
that swap itself does not escape any of its parameters. It doesn't say 
that the parameters actually have the same scope.
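
A minimal sketch of that reading of scope, in D. The names readThrough, 
keep, and stash are hypothetical and not from the thread. Whether a 
compiler actually rejects the commented-out assignment depends on the 
compiler and its settings; the sketch only shows the intent.

```d
// Hypothetical illustration; build with: dmd -unittest -main file.d

int* stash; // module-level pointer that outlives any single call

// "scope" here promises only that p does not escape readThrough.
int readThrough(scope int* p)
{
    // stash = p; // this would escape p, which scope is meant to forbid
    return *p;    // using p locally is fine
}

// No "scope" on p: this function stores the pointer, so p escapes.
void keep(int* p)
{
    stash = p;
}

unittest
{
    int local = 42;
    assert(readThrough(&local) == 42); // safe to pass a local's address

    int* onHeap = new int;
    *onHeap = 7;
    keep(onHeap);                      // heap memory may be retained
    assert(*stash == 7);
}
```

Note that the annotation describes readThrough alone. It says nothing 
about how the lifetimes of two different arguments compare.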

> For instance:
> 
>     // broader scope (global)
>     int a = 1;
>     int* pa = &a;
> 
>     void test()
>     {
>         // narrower scope (local)
>         int b = 2;
>         int* pb = &b;
>         swap(pa, pb); // should be an error
>     }
> 
> Here, swap(pa, pb), if allowed, would attempt to put a pointer to the 
> local variable b into the global variable pa. After the function 
> returns, pointer pa becomes invalid. To avoid this, you need the 
> signature for swap to express a more complex scope constraint. Using 
> Robert Jacques's suggested notation, I think it should be like:
> 
>     void swap(scope ref int* a, scope ref int* b)
>         if ((*a).scope <= b.scope && (*b).scope <= a.scope)
>     {
>         scope int* tmp = a;
>         a = b; b = tmp;
>     }

This would require each variable to carry its actual scope either at 
compilation or at runtime. Either would be very hard to implement, if 
possible at all. But I'm glad there is a bit of exaggeration on the side 
of safety :o).

> Basically, if you're giving b a pointer to the value pointed to by a, 
> the value pointed to by a needs to be alive as long or longer than b if 
> you want to avoid having b point to invalid memory. Same for a and the 
> value pointed to by b.
> 
> Robert's notation is good for expressing that in the function signature, 
> but I don't like it because it places constraints on the signature, not 
> on the type, making constraint propagation a lot more complicated.
> 
> So I've been thinking of this notation:
> 
>     void swap(scopeof(*b) ref int* a, scopeof(*a) ref int* b)
>     {
>         scopeof(a) int* tmp = a;
>         a = b; b = tmp;
>     }

That's nice, but it would reduce what swap can do. Oftentimes it is not 
known whether values manipulated by a function are in the same scope or not.

> Note that tmp here needs to be scopeof(a): this guarantees that whatever 
> you put in tmp, you can copy into b later (since scopeof(a) == 
> scopeof(*b)).
> 
> Oh, and if you don't care about the type of tmp, including the scoping 
> constraints, you should be allowed to substitute it all with "auto":
> 
>     void swap(scopeof(*b) ref int* a, scopeof(*a) ref int* b)
>     {
>         auto tmp = a;
>         a = b; b = tmp;
>     }
> 
> - - -
> 
> That's enough for the swap function. Now let's take a look at a plain 
> setter function, which, I'm pretty sure, is a common pattern.
> 
> Let's try this first:
> 
>     struct S
>     {
>         int *pi;
>     }
> 
>     void setPI(scope S s, scope int* pi)
>     {
>         s.pi = pi;
>     }
> 
> Simple enough, you think? But we have the same problem: the scope of the 
> value pointed to by pi may be narrower than the one s is in:

I am glad there is awareness of the limitations of scope. There are 
still things that are outside scope's charter, and the summary is: scope 
says something about the function declaring it, not about a relationship 
between its parameters.
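
As a rough sketch of that distinction, the D code below uses plain 
pointers with no scope annotations. The names swapPtrs, pGlobal, and 
localLeaksIntoGlobal are hypothetical. The swap keeps nothing for itself 
once it returns, yet the caller still ends up with a long-lived pointer 
referring to a short-lived variable, because the trouble lies in how the 
two arguments relate to each other.

```d
// Hypothetical illustration; build with: dmd -unittest -main file.d

// Swaps two pointers. tmp lives only for the duration of the call,
// so the function itself retains nothing after it returns.
void swapPtrs(ref int* x, ref int* y)
{
    int* tmp = x;
    x = y;
    y = tmp;
}

int global = 1;
int* pGlobal; // module-level pointer, outlives any function call

// Returns true if, right after the swap, the long-lived pointer
// refers to the short-lived local variable.
bool localLeaksIntoGlobal()
{
    pGlobal = &global;
    int local = 2;
    int* pLocal = &local;

    swapPtrs(pGlobal, pLocal);

    bool leaked = (pGlobal is &local);
    pGlobal = &global; // restore before local goes out of scope
    return leaked;
}

unittest
{
    assert(localLeaksIntoGlobal());
    assert(*pGlobal == 1);
}
```

Without the restoring assignment, pGlobal would dangle as soon as 
localLeaksIntoGlobal returned, which is the case Michel's test() 
describes.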


Andrei


