Escape analysis (full scope analysis proposal)

Mon Nov 3 13:01:27 PST 2008

"Andrei Alexandrescu" wrote
> Steven Schveighoffer wrote:
>> "Andrei Alexandrescu" wrote
>>> Steven Schveighoffer wrote:
>>>> I think everyone who thinks a scope decoration proposal is going to 1) 
>>>> solve all scope escape issues and 2) be easy to use is dreaming :P
>>> I think that's a fair assessment. One suggestion I made Walter is to 
>>> only allow and implement the scope storage class for delegates, which 
>>> simply means the callee will not squirrel away a pointer to delegate. 
>>> That would allow us to solve the closure issue and for now sleep some 
>>> more on the other issues.
>>
>> If scope delegates means trust the coder knows what he is doing (in the 
>> beginning), I agree with that plan of attack.
>
> It looks like things will move that way. Bartosz, Walter and I talked a 
> lot yesterday about it - a lot of crazy things were on the table! The next 
> step is to make this a reference, which is highly related to escape 
> analysis. At the risk of anticipating a bit an unfinalized design, here's 
> what's on the table:
>
> * Continue an "anything goes" policy for *explicit* pointers, i.e. those 
> written explicitly by user code with stars and stuff.
>
> * Disallow pointers in SafeD.

Isn't this already the case?

BTW, slightly OT, I read Bartosz' article on digitalmars about SafeD.  This 
isn't an implemented language right?  Is the plan for D to become SafeD?  Or 
is there going to be a compiler switch?  Or something else maybe?  I've 
heard SafeD mentioned a lot on this NG, without ever really knowing how it 
exists (concrete or theory).

>
> * Make all ref parameters scoped by default. There will be impossible for 
> a function to escape the address of a ref parameter without a cast. I 
> haven't proved it to myself yet, but I believe that if pointers are not 
> used and with the amendments below regarding arrays and delegates, this 
> makes things entirely safe. In Walter's words, "it buttons things pretty 
> tight".

I think this sounds reasonable.  However, will there be a way to override 
this behavior?  For example, some modifier to signify that a reference is 
not scope?  The advantage of having non-scope be the default is that the 
scope keyword already exists.

Having to cast every time I convert to a pointer will be unpleasant, but 
not horrific.  I'd prefer to state one time 'this is an unsafe reference', 
preferably in the signature, and be able to use it like before.  The same 
semantics still apply as far as calling the function, it just says "the 
author of this function knows what he is doing" to the compiler.

You would also disallow this keyword usage in SafeD which would be easy to 
filter.

noscope would be a good keyword...

> * Make this a reference so that it obeys what references obey.

This is one place where I think whole-heartedly it should be done.  One 
rarely needs the address of this; in fact, I generally end up returning *this 
quite a bit in struct operators, so this change will be most welcome.
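
To illustrate what the change buys, here is a minimal sketch, assuming this 
behaves as a reference inside struct member functions as proposed.  Counter 
and its += operator are made up for the example; the point is only that the 
operator returns this directly instead of *this:

```d
struct Counter
{
    int value;

    // Returns the struct itself by reference so that calls can be chained.
    // With `this` as a reference there is no pointer to dereference.
    ref Counter opOpAssign(string op)(int n) if (op == "+")
    {
        value += n;
        return this;
    }
}

unittest
{
    Counter c;
    (c += 2) += 3;   // the second += acts on the same Counter
    assert(c.value == 5);
}
```

Nothing in the operator takes an address, so it would not need a cast or any 
special decoration under the proposed rules.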

> * If people want to implement e.g. linked lists, they should do it with 
> classes. Implementing them with structs will require casts to obtain and 
> escape &this. That also means they'd be using pointers, so anything goes - 
> pointers are not restricted from escaping.

I implemented dcollections' node-based containers (tree, hash, linked list) 
as structs, because I wanted to control the allocation of them.  I agree 
with others that the de facto standard is going to be structs, since 
performance is paramount, and you have little need for OOP in the internal 
node structures.

Also, if the noscope (or equivalent keyword) is implemented as above, you 
can easily decorate your pointer-using functions:

struct LinkNode(T)
{
    // noscope is the hypothetical keyword suggested above
    noscope
    {
        LinkNode *find(T value);
        LinkNode *findReverse(T value);
        ...
    }
}
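
For comparison, here is a rough sketch of what such a node looks like with 
plain pointers and no decoration at all, which is the "anything goes" case.  
The value, next and prev fields and the function bodies are my own 
assumptions for illustration, not the dcollections code:

```d
struct LinkNode(T)
{
    T value;
    LinkNode* next;
    LinkNode* prev;

    // Walks forward starting at this node; returns null if nothing matches.
    LinkNode* find(T target)
    {
        for (auto n = &this; n !is null; n = n.next)
            if (n.value == target)
                return n;
        return null;
    }

    // Walks backward starting at this node; returns null if nothing matches.
    LinkNode* findReverse(T target)
    {
        for (auto n = &this; n !is null; n = n.prev)
            if (n.value == target)
                return n;
        return null;
    }
}

unittest
{
    auto a = new LinkNode!int(1);
    auto b = new LinkNode!int(2);
    a.next = b;
    b.prev = a;

    assert(a.find(2) is b);
    assert(b.findReverse(1) is a);
    assert(a.find(3) is null);
}
```

Both functions take and return the address of this, which is exactly the 
operation that would need a cast (or a noscope-style marker) if ref and this 
become scoped by default.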

> * There are two cases in which things escape without the user explicitly 
> using pointers: delegates and dynamic arrays initialized from 
> stack-allocated arrays.
>
> * For delegates require the scope keyword in the signature of the callee. 
> A scoped delegate cannot be stored, only called or passed down to another 
> function that in turn takes a scoped delegate. This makes scope delegates 
> entirely safe. Non-scoped delegates use dynamic allocation.

If noscope (or equivalent keyword) is used, can we make scope the default? 
I'd much rather have the default be the higher-performance, more commonly 
used option.

Also, when you say stored, do you mean stored anywhere, or stored anywhere 
but the stack?  Because there is no harm in storing a scope delegate in a 
local variable (as long as it is also scope).
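
Here is a small sketch of what I mean, using scope on a delegate parameter as 
I understand the proposal.  The names forEachSquare and sumOfSquares are 
hypothetical.  The callee keeps the delegate in a scope local and calls it, 
but never stores it anywhere that outlives the call:

```d
// The callee promises not to squirrel away `dg`; it only calls it.
void forEachSquare(int n, scope void delegate(int) dg)
{
    // Holding the delegate in a scope local is harmless: the local dies
    // with this stack frame, before the caller's frame does.
    scope local = dg;
    foreach (i; 0 .. n)
        local(i * i);
}

int sumOfSquares(int n)
{
    int total = 0;
    // The delegate literal captures `total`, a stack variable of this
    // function.
    forEachSquare(n, (int x) { total += x; });
    return total;
}

unittest
{
    assert(sumOfSquares(4) == 14); // 0 + 1 + 4 + 9
}
```

Since the delegate cannot outlive sumOfSquares, the captured variable could 
stay on the stack instead of being moved to a heap-allocated closure.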

> * We don't have an idea for dynamic arrays initialized from 
> stack-allocated arrays.

Hm... this is a tough one.  At the very least, you can disallow returning 
such arrays, as long as the compiler can prove the array's origin.  That 
should cover 90% of the issues.

The other 10% are ones that are passed into functions.  You might employ the 
same techniques as for delegates, but then we are stuck with the same 
problems that full escape analysis is needed to solve.  Plus the need to return a 
slice of an array is much greater than the need to return a delegate.

You could also argue that an array contains a pointer, and morphing into a 
dynamic array is the same as taking the address of a stack local variable 
(which would require a cast).  But that means SafeD cannot use dynamic 
arrays to reference static arrays.  However, you can then argue that dynamic 
arrays allocated using new are OK for SafeD because you didn't take the 
address of a local stack variable.  My understanding is that in SafeD, 
safety trumps performance.

Note that a dynamic array sliced from a static array can be used as a 
rebindable reference, since it has a rebindable pointer in it, so the slicing 
is really an unsafe operation:

int[2] a;

int[] aref = a[0..1]; // reference to a[0]
aref = a[1..2]; // rebind to a[1]
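
A slightly fuller sketch of the same thing, together with the heap-copy 
alternative that should be fine for SafeD.  The function name copyOut is made 
up, and using .dup for the copy is my choice for the example:

```d
int[] copyOut()
{
    int[2] a = [10, 20];

    int[] aref = a[0 .. 1]; // view of a[0]
    aref = a[1 .. 2];       // rebound to a[1]
    aref[0] = 99;           // writes through to the stack array
    assert(a[1] == 99);

    // Returning aref here would hand out a pointer into a dead stack frame.
    // A heap-allocated copy has no such problem.
    return a[].dup;
}

unittest
{
    assert(copyOut() == [10, 99]);
}
```

The copy costs an allocation, which is the safety-over-performance trade 
mentioned above.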

-Steve