Escape analysis (full scope analysis proposal)

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Mon Nov 3 18:19:24 PST 2008


Steven Schveighoffer wrote:
> "Andrei Alexandrescu" wrote
>> * Disallow pointers in SafeD.
> 
> Isn't this already the case?

At one point we wanted to allow pointers in restricted ways.

> BTW, slightly OT, I read Bartosz' article on digitalmars about SafeD.  This 
> isn't an implemented language right?  Is the plan for D to become SafeD?  Or 
> is there going to be a compiler switch?  Or something else maybe?  I've 
> heard SafeD mentioned a lot on this NG, without ever really knowing how it 
> exists (concrete or theory).

It's planned as a compiler switch and module option. Essentially SafeD 
is slated to be a safe, proper, well-defined subset of D. It was 
Bartosz's idea, and IMHO an important dimension of D's development. 
Walter is implementing module safety options like this:

module(safe) mymodule;

which means the module must always be compiled with safety on. By 
contrast,

module(system) mymodule;

means the module is getting its hands greasy.

>> * Make all ref parameters scoped by default. It will be impossible for 
>> a function to escape the address of a ref parameter without a cast. I 
>> haven't proved it to myself yet, but I believe that if pointers are not 
>> used and with the amendments below regarding arrays and delegates, this 
>> makes things entirely safe. In Walter's words, "it buttons things pretty 
>> tight".
> 
> I think this sounds reasonable.  However, will there be a way to override 
> this behavior?  For example, some modifier to signify that a reference is 
> not scope?  The advantage to having the other be the default is that the 
> scope keyword already exists.

Good point. I think escaping the address of a ref should be allowed via 
a cast.

> Having to cast for every time I convert to a pointer will be unpleasant, but 
> not horrific.  I'd prefer to state one time 'this is an unsafe reference', 
> preferably in the signature, and be able to use it like before.  The same 
> semantics still apply as far as calling the function, it just says "the 
> author of this function knows what he is doing" to the compiler.

Currently Walter plans to do that at module granularity.

> You would also disallow this keyword usage in SafeD which would be easy to 
> filter.
> 
> noscope would be a good keyword...
> 
>> * Make this a reference so that it obeys what references obey.
> 
> This is one place where I think whole-heartedly it should be done.  One 
> rarely needs the address of this, in fact, I generally end up returning *this 
> quite a bit in struct operators, so this change will be most welcome.
> 
>> * If people want to implement e.g. linked lists, they should do it with 
>> classes. Implementing them with structs will require casts to obtain and 
>> escape &this. That also means they'd be using pointers, so anything goes - 
>> pointers are not restricted from escaping.
> 
> I implemented dcollections' node-based containers (tree, hash, linked list) 
> as structs, because I wanted to control the allocation of them.  I agree 
> with others that the de facto standard is going to be structs, since 
> performance is paramount, and you have little need for OOP in the internal 
> node structures.
> 
> Also, if the noscope (or equivalent keyword) is implemented as above, you 
> can easily decorate your pointer-using functions:
> 
> struct LinkNode(T)
> {
> noscope
> {
>     LinkNode *find(T value);
>     LinkNode *findReverse(T value);
>     ...
> }
> }
>> * There are two cases in which things escape without the user explicitly 
>> using pointers: delegates and dynamic arrays initialized from 
>> stack-allocated arrays.
>>
>> * For delegates require the scope keyword in the signature of the callee. 
>> A scoped delegate cannot be stored, only called or passed down to another 
>> function that in turn takes a scoped delegate. This makes scope delegates 
>> entirely safe. Non-scoped delegates use dynamic allocation.
> 
> If noscope (or equivalent keyword) is used, can we make scope the default? 
> I'd much rather have the default be the higher-performance, more commonly 
> used option.

I think safety should be the default. People who care about efficiency 
will be willing to write a little bit more. I agree that this is 
annoying if that's the more frequent situation.

> Also, when you say stored, do you mean stored anywhere, or stored anywhere 
> but the stack?  Because there is no harm in storing a scope delegate in a 
> local variable (as long as it is also scope).

That could be allowed, but probably it's not really needed.
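
To make the scoped-delegate rule concrete, here is a minimal sketch, assuming the scope storage class on a delegate parameter as current D2 compilers accept it. The names forEachUpTo and sumUpTo are hypothetical and used only for illustration. The callee only calls the delegate and never stores it, so no heap-allocated closure should be needed:

```d
import std.stdio : writeln;

// Hypothetical helper: `scope` says dg is only called here, never stored.
void forEachUpTo(int n, scope void delegate(int) dg)
{
    foreach (i; 0 .. n)
        dg(i);
}

// Hypothetical caller: the delegate literal captures the local `sum`.
int sumUpTo(int n)
{
    int sum = 0;
    forEachUpTo(n, (int i) { sum += i; });
    return sum;
}

void main()
{
    writeln(sumUpTo(4)); // 0 + 1 + 2 + 3
}
```

Storing dg in a global or a class field inside forEachUpTo is exactly what the proposal would reject.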

>> * We don't have an idea for dynamic arrays initialized from 
>> stack-allocated arrays.
> 
> Hm... this is a tough one.  At the very least, you can disallow returning 
> such arrays, as long as the compiler can prove the array's origins.  That 
> should cover 90% of the issues.
> 
> The other 10% are ones that are passed into functions.  You might employ the 
> same techniques as for delegates, but then we are stuck with the same 
> problems as needed for full escape analysis.  Plus the need to return a 
> slice of an array is much greater than the need to return a delegate.
> 
> You could also argue that an array contains a pointer, and morphing into a 
> dynamic array is the same as taking the address of a stack local variable 
> (which would require a cast).  But that means SafeD cannot use dynamic 
> arrays to reference static arrays.  However, you can then argue that dynamic 
> arrays allocated using new are OK for SafeD because you didn't take the 
> address of a local stack variable.  My understanding is that in SafeD, 
> safety trumps performance.
> 
> Note that a static array could be used for a rebindable reference, since it 
> has a rebindable pointer in it, so it is really an unsafe operation:
> 
> int[2] a;
> 
> int[] aref = a[0..1]; // reference to a[0]
> aref = a[1..2]; // rebind to a[1]

I agree with the above. The floor is open for more ideas.
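
The aliasing in the quoted example can be checked directly. This is a minimal sketch, assuming current D2 slice semantics. The helper name heapCopy is hypothetical, and copying with .dup is only one possible way to let data outlive the stack frame, not something decided in this thread:

```d
// Hypothetical helper: copy a stack array to the GC heap before returning it.
int[] heapCopy()
{
    int[2] local = [1, 2];
    return local[].dup; // the copy no longer refers to `local`
}

void main()
{
    int[2] a = [10, 20];

    int[] aref = a[0 .. 1]; // refers to a[0]
    aref = a[1 .. 2];       // rebound to a[1]
    aref[0] = 99;           // writes through to the static array
    assert(a[1] == 99);

    int[] copy = heapCopy();
    assert(copy == [1, 2]);
}
```

The write through aref shows that the slice is really a pointer into the stack array, which is why returning it unchanged would be the unsafe case.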


Andrei


