Escape analysis (full scope analysis proposal)

Wed Nov 12 16:45:02 PST 2008

Andrei Alexandrescu Wrote:
> But how do you type then the assignment example?
> 
> void assign(int** p, int * r) { *p = r; }
> 
> How do you reflect the requirement that r's region outlives *p's region?
> 
> But that's not even the point. Say you define some notation, such as:
> 
> void assign(int** p, int * r) if (region(r) <= region(p));
> 
> But the whole point of regions was to _simplify_ notations like the 
> above into:
> 
> void assign(region R)(int*R* p, int *R r);
> 
> So although you think you simplified things by using region(symbol) 
> instead of symbolic names, you complicated things. The compiler still 
> needs to infer regions for each value, so it is as complicated as a 
> named-regions compiler, and in addition you require the user to write 
> bulkier expressions because you disallow use of symbols. So everybody is 
> worse off. Note how in the example using a symbolic region the outlives 
> relationship is enforced implicitly by using the same symbol name in two 
> places.

Examples like this one are rare enough that requiring annotations
for them is affordable. I was under the impression that D is supposed
to promote references over pointers. People working with low-level
code will probably either appreciate the optimization and correctness
checking, or can ask for a way to turn off compiler enforcement of
scoping in low-level code fragments.
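
For concreteness, here is roughly what the quoted assignment allows in
plain D when nothing checks scopes. This is only a sketch: `saved` and
`leak` are names I made up, and the dangling pointer is deliberately
never dereferenced.

```d
// The quoted example: store r into the pointer that p refers to.
void assign(int** p, int* r) { *p = r; }

int* saved; // module-level pointer, outlives every function call

void leak()
{
    int local = 42;
    // Accepted in unchecked code, yet `saved` dangles once leak() returns,
    // because `local` lives in a shorter scope than `saved`.
    assign(&saved, &local);
}

void main()
{
    int x = 7;
    int* p;
    assign(&p, &x);  // fine: x lives at least as long as p
    assert(*p == 7);

    leak();
    assert(saved !is null); // still set, but it must not be dereferenced
}
```

The second call is the case an escape analysis would have to reject or
have annotated.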

> I suspect there are things you can't even express without symbolic 
> regions. Consider this example from Dan's slides:
> 
> struct ILst(region R1, region R2) {
>      int *R1 hd;
>      ILst!(R1, R2) *R2 tl;
> }
> 
> This code reflects the fact that the list holds pointer to integers in 
> one region, whereas the nodes themselves are in a different region. It 
> would be a serious challenge to tackle that without symbolic regions, 
> and simpler that won't be for anybody.

Transitive scope ownership ensures that a member of a structure outlives
the structure itself. We can therefore create a list in a local scope
and add objects allocated in that scope, in any parent scope, or on
the heap. Referencing objects from child scopes would be incorrect, and
I don't think it's unreasonable to expect the programmer to code around
such a desire.
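
A rough way to picture the rule, as a runtime model in plain D rather
than anything the compiler would actually do. Everything here is
hypothetical (`Region`, `outlives`, `ScopedList`, `tryAdd`), and it
assumes a single chain of nested scopes, so a nesting depth is enough
to compare lifetimes; sibling scopes are not modelled.

```d
/// Hypothetical model: depth 0 is global/heap data,
/// larger depths are more deeply nested (shorter-lived) scopes.
struct Region
{
    uint depth;
}

/// True when data in `a` lives at least as long as data in `b`.
bool outlives(Region a, Region b)
{
    return a.depth <= b.depth;
}

/// A list owned by one scope; it only accepts elements that outlive it.
struct ScopedList
{
    Region owner;
    int[] items; // stand-in for references to the elements

    bool tryAdd(int value, Region from)
    {
        if (!outlives(from, owner))
            return false; // element would die before the list does
        items ~= value;
        return true;
    }
}

void main()
{
    auto heap   = Region(0);
    auto parent = Region(1);
    auto local  = Region(2);
    auto child  = Region(3);

    auto list = ScopedList(local);
    assert(list.tryAdd(1, heap));   // heap object: accepted
    assert(list.tryAdd(2, parent)); // parent scope: accepted
    assert(list.tryAdd(3, local));  // same scope: accepted
    assert(!list.tryAdd(4, child)); // child scope: rejected
    assert(list.items == [1, 2, 3]);
}
```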

foo*R*Q x, if (R in Q)
is illegal, because it could produce a dangling reference.

foo*R*Q x, if (Q in R)
is equivalent to foo*Q*Q, for the purpose of:
*x = y;
where y is one of foo*R, foo*Q or foo*global

A problem arises for other operations, though:
foo*R*Q might have different semantics than foo*Q*Q
when it appears on the right-hand side of an assignment.
y = *x;
is legal for foo*R y, but not for foo*Q y.

Therefore, while the lifetime must always stay constant
or be reduced towards the right side of the type declaration,
it is necessary to be able to explicitly relax restrictions
towards the left.

The problem is that the type syntax suits transitive scope
relaxation, not transitive scope restriction.
I.e. global(foo*)* makes sense when * is scoped by default,
but scope(foo*)* doesn't make sense when * is global
by default.

So we could either implement it with regions, which I'm
not a big fan of (better than nothing though!),
or ditch "scope" (as a restriction) in favor of "global"
and maybe "scopeof()" (as a relaxation).

Hopefully D2 and the book will be done soon, so that
the development of D3 can start and a breaking
change like this can be introduced.

> But a possible path is to make arrays safe and leave pointers for those 
> cases in which efficiency is of utmost importance. With luck, those 
> cases are rare.

Safe, sure, but not by forbidding the use of stack arrays.
Let's try to keep D performance competitive with C++, not C#. :P
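
The kind of usage I would like to keep legal looks like the sketch
below (`sum` is just an illustrative helper). The slice only travels
down into a callee, so it cannot outlive the stack buffer it refers to.

```d
/// Adds up the elements of a slice; it does not store the slice anywhere.
int sum(const(int)[] xs)
{
    int total = 0;
    foreach (x; xs)
        total += x;
    return total;
}

void main()
{
    int[4] buf = [1, 2, 3, 4]; // static array, stored in main's stack frame
    // buf[] is a slice over that stack memory; passing it to a callee
    // is safe because the callee returns before buf goes away.
    assert(sum(buf[]) == 10);
}
```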