An important potential change to the language: transitory ref

Sat Mar 20 05:04:15 PDT 2010

Andrei Alexandrescu Wrote:

> On 03/20/2010 12:56 AM, Steven Schveighoffer wrote:
> > On Sat, 20 Mar 2010 00:07:55 -0400, Andrei Alexandrescu
> > <SeeWebsiteForEmail at erdani.org> wrote:
> >> I remember you brought up a similar point in a related discussion a
> >> couple of years ago. It's a good point, and my current understanding
> >> of the matter is that functions that take and return ref could and
> >> should be handled conservatively.
> >
> > I don't like the sound of that... What I fear is that the compiler will
> > force people to start using pointers because refs don't cut it. I'm
> > guessing you mean you cannot return ref returns from other functions?
> > That breaks abstraction principles, I should be able to delegate a task
> > to a sub-function.
> 
> Perhaps it means you can't return ref returns from other functions if 
> you pass them references to local state.
> 
> (I've read a paper at some point about a program analysis that stored 
> for each function the "return pattern" - a mini-graph describing the 
> relationship between parameters and result. If it rings a bell to 
> anyone... please chime in.)

This would be full escape analysis :)  I agree it is the best solution, but it requires D to have special object files and a special linker.

> 
> >>> For instance, try to find a rule that prevents the above from compiling,
> >>> but allows the following to compile.
> >>>
> >>> struct S
> >>> {
> >>> private int x;
> >>> ref int getX() { return x;}
> >>> }
> >>>
> >>> struct T
> >>> {
> >>> S s;
> >>> ref int getSX() { return s.x; }
> >>> }
> >>
> >> In the approach discussed with Walter, S is illegal. A struct can't
> >> define a method to return a reference to a direct member. This is
> >> exactly the advice given in Scott's book for C++. (A class can because
> >> classes sit on the heap.)
> >
> > A struct may sit on the heap too.
> 
> Yes. For those cases you can always use pointers, which are not subject 
> to the restrictions I envision for ref.
> 
> It's a very small inconvenience. For example, if you have a linked list 
> struct, you may feel constrained that you can't do:
> 
> struct List {
>      List * next;
>      List * prepend(List * lst) {
>          lst.next = &this;
>          return lst;
>      }
> }
> 
> In my approach, &this is illegal. And actually for a good reason. This 
> code bombs:
> 
> List iForgotThePointer() {
>      List lst;
>      lst.prepend(new List);
>      return lst;
> }
> 
> My response to the above issue is two-pronged:
> 
> (a) For List a class would be an alternative

Note, this is impractical if you care about performance (I know, because originally, dcollections used classes for link nodes).

> (b) To work with pointers to structs use static member functions and 
> pointers instead of methods and references

This prevents something like a linked list from being used in safe D.  That might be too much of a restriction.  Yes, it makes code safer, but it makes safe D unusable.
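
For reference, option (b) applied to the List example might look roughly like the sketch below. The `value` field, the two-argument `prepend` signature, and the `buildExample` helper are assumptions added for illustration; none of them are in the original example.

```d
struct List
{
    int value;      // hypothetical payload, added so the result can be checked
    List* next;

    // Static function instead of a method: there is no `this`, so `&this`
    // is never needed. Both nodes are passed as explicit pointers.
    static List* prepend(List* head, List* node)
    {
        node.next = head;
        return node;
    }
}

// Builds the two-node list 2 -> 1 on the heap and returns its head.
List* buildExample()
{
    List* head = new List(1);
    return List.prepend(head, new List(2));
}

void main()
{
    List* head = buildExample();
    assert(head.value == 2);
    assert(head.next.value == 1);
    assert(head.next.next is null);
}
```

Every node here comes from `new`, so nothing points into a stack frame, but the whole interface is expressed in pointers, which is the restriction I am worried about.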

> The goal is worth pursuing, so let's keep on thinking of how to make it 
> work. If D manages to define demonstrably safe encapsulated containers, 
> that would be an absolutely huge win.

I agree.

> 
> > Here's another case:
> >
> > struct S
> > {
> > int*x;
> > ref int getX() {return *x;}
> > }
> >
> > Is x on the heap or not? How do you know? Arrays are just a wrapped
> > pointer, so they too could be stack allocated.
> 
> struct S
> {
>      int*x;
>      static ref int getX(S * p) {return *p.x;}
> }
> 
> In an ideal world, if you have your hands on a pointer to a struct, you 
> should be reasonably certain that that lives on the heap. It would be 
> just great if D could guarantee that.

Again, not in safe D.  Any time pointers enter the mix, safe D is disqualified, no?

> 
> > Consider this:
> >
> > void foo(ref int x)
> > {
> > x++;
> > }
> >
> > struct S
> > {
> > int x;
> > int y;
> > bool xisy;
> > ref int getX() {if(xisy) return y; return x;}
> > }
> >
> > foo(S.x);
> > foo(S.getX());
> 
> Hm, I assume the two lines refer to an object of type S. The example 
> above would again have to be rewritten in terms of static functions with 
> pointers.

Yes, I meant to write:
S s;
foo(s.x);
foo(s.getX());
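
Spelled out as a complete program, the corrected example could look like the sketch below. The `run` helper and the assertions are additions for illustration only; they show which field each call ends up modifying.

```d
void foo(ref int x)
{
    x++;
}

struct S
{
    int x;
    int y;
    bool xisy;
    // Returns a reference to y or to x, chosen at run time.
    ref int getX() { if (xisy) return y; return x; }
}

// Hypothetical helper: runs the two calls from the example and
// returns the resulting [x, y].
int[2] run(bool xisy)
{
    S s;
    s.xisy = xisy;
    foo(s.x);       // always increments s.x
    foo(s.getX());  // increments s.y if xisy is set, otherwise s.x again
    return [s.x, s.y];
}

void main()
{
    assert(run(false) == [2, 0]);
    assert(run(true) == [1, 1]);
}
```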

> 
> > Another case:
> >
> > struct S
> > {
> > int x;
> > ref S opUnary(string op)() if (op == "++") {++x; return this;}
> > }
> >
> > I feel this should all be possible.
> 
> I think opUnary should return void and the compiler should worry about 
> that result being used.

Yes, you are probably right.  It would be cool if the compiler could automatically do that in all cases where the operator is expected to return a reference to the struct, e.g. += and -=.  That removes one of my biggest concerns.
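
A minimal sketch of what the void-returning form might look like, for both ++ and +=. The `int` right-hand operand of `opOpAssign` and the `run` helper are assumptions for illustration. The sketch only uses the operators as statements, since the automatic handling of the result discussed above is a proposal, not something I am claiming the compiler does.

```d
struct S
{
    int x;

    // Returns void instead of `ref S`, so no reference to the struct escapes.
    void opUnary(string op)() if (op == "++") { ++x; }

    // Same idea for a compound assignment operator.
    void opOpAssign(string op)(int rhs) if (op == "+") { x += rhs; }
}

// Hypothetical helper: applies both operators and returns the final value.
int run()
{
    S s;
    ++s;     // x becomes 1
    s += 3;  // x becomes 4
    return s.x;
}

void main()
{
    assert(run() == 4);
}
```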

> 
> > ------
> > counter proposal:
> >
> > What about having a new kind of ref that can only be passed up the
> > stack, or down only one level if you are the one who initiated it.
> >
> > Call it scope ref:
> >
> > ref int baz(ref int y)
> > {
> > return y;
> > }
> > scope ref int foo(scope ref int x, ref int y)
> > {
> > //return x; // illegal, we did not make x scope ref
> > //return baz(x); // illegal, cannot convert scope ref into ref
> > return y; // legal, you can convert a ref parameter into scope ref.
> > }
> >
> > scope ref int bar()
> > {
> > int y;
> > //return foo(y, y); // illegal, you cannot pass scope refs down the stack
> > // more than one level
> > }
> >
> > At least this leaves ref alone to be used without restrictions that the
> > compiler can't prove are necessary. If we find scope ref is the only
> > kind of ref we ever use, then maybe we can get rid of scope ref and just
> > make ref be the restricted form. Or you could keep scope ref and reserve
> > ref for only provable heap-variables.
> >
> > Man, it would be nice to have escape analysis...
> 
> It sure would, but it quickly gets into the interprocedural tarpit.

It requires some sort of inter-object analysis, like you mentioned at the top.

> 
> Your idea is good, except I don't see why not make ref scoped ref. After 
> all ref is currently not an enabler - it could be missing from the 
> language; pointers are fine. So why not make ref do something actually 
> interesting?

The whole point was to run an experiment: see how many things can be written with scope ref instead of plain ref, leaving ref the way it is today.  If they all can, with minor adjustments, then we can rename scope ref to simply ref.  If some can't, and those also don't make sense as pointers, then we have a predicament: ref enables some kind of code that neither pointers nor scope ref can express.  My point is that I don't know off the top of my head all the use cases for ref, and you probably don't either.  I don't think a sweeping change like this is a good idea without at least some practical evidence.

-Steve