Thoughts about 'ref' and 'out' return values

Zach the Mystic reachBUTMINUSTHISzach at gOOGLYmail.com
Sun Feb 17 10:53:52 PST 2013


On the thread for DIP25, http://wiki.dlang.org/DIP25 , I 
introduced the possibility of using 'out' return values to ensure 
memory safety. The system is very simple, and I think it is 
pretty good.

ref int giveRef(ref int a) {...}
out int giveOut(ref int a) {...}

Every expression carries a locality bit whose value is either 
'local' or 'global'. The rule is that a reference to a local can 
never escape, either by being returned or by being assigned to a 
global variable. A function returning 'out' is assumed to give a 
global result. The result of a function returning 'ref' is as 
local as its most local 'ref' argument, defaulting to global 
when there are no reference parameters. 'out' implies 
'ref' when the return type has value semantics, but not when it 
has reference semantics. (This might be a good rule to apply 
generally to 'ref', i.e. 'ref' promotes to the first non-value 
type, doing nothing if you're already there, but it's probably 
too late for that, and it's a completely different issue.)
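
As a rough illustration of the rule above, the one-bit scheme can 
be modeled as ordinary D functions over an enum. This is only a 
sketch of the idea, not compiler code: the names Locality, 
refResult and outResult are hypothetical and are not part of the 
proposal.

```d
// Hypothetical model of the one-bit locality rule; not compiler code.
enum Locality { global, local }

// A call to a 'ref'-returning function is as local as its most local
// 'ref' argument; with no reference arguments it defaults to global.
Locality refResult(const Locality[] refArgs)
{
    foreach (arg; refArgs)
    {
        if (arg == Locality.local)
            return Locality.local;
    }
    return Locality.global;
}

// A call to an 'out'-returning function is assumed to give a global
// result, whatever was passed in.
Locality outResult(const Locality[] refArgs)
{
    return Locality.global;
}
```

In this model, passing one local argument among several globals 
to refResult is enough to make the result local, while outResult 
ignores its arguments entirely.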

Within its own body, an 'out'-returning function implicitly marks 
all of its reference parameters as locals, whereas a 
'ref'-returning function treats them as globals unless they are 
marked 'scope'. This simple system ignores the difference between 
a 'scope' parameter, which cannot be escaped at all, and a 'ref' 
parameter, which simply is not allowed to be returned. For 
example:

static int* golum;

ref int giveRef(ref int a)
{
   golum = &a; // This is unsafe, but allowed
}

out int giveOut(ref int a)
{
   golum = &a; // Illegal, a has been implicitly 'scope'-ed.
}

ref int giveScope(scope int a)
{
   golum = &a; // Illegal, a is explicitly scoped
}

To make the system fully safe, you'd have to store two bits, one 
for local/global, and one for returnable/not-returnable. But I'm 
going to ignore this corner case on the assumption that it is 
rare. I want to push using a single 'local/global' 
bit-per-expression as far as I can.
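
For comparison, here is a minimal sketch of what the two-bit 
variant might look like. Everything in it is an assumption made 
for illustration: the Escape struct, its field names, and the way 
each kind of variable is classified are my reading of the cases 
above, not something the proposal spells out.

```d
// Hypothetical two-bit model; all names are invented for illustration.
struct Escape
{
    bool local;      // bit 1: local (true) or global (false)
    bool returnable; // bit 2: may be returned by reference
}

// Assumed classification of the cases discussed above.
enum Escape globalData = Escape(false, true);
enum Escape refParam   = Escape(true, true);  // 'ref' parameter of a 'ref' function
enum Escape scopeParam = Escape(true, false); // 'scope' or implicitly scoped parameter
enum Escape stackVar   = Escape(true, false); // ordinary local variable

// Storing an address in a global requires global data.
bool mayAssignToGlobal(Escape e) { return !e.local; }

// Returning by reference only requires the 'returnable' bit.
bool mayReturn(Escape e) { return e.returnable; }
```

Under these assumptions, the 'golum = &a' assignment in giveRef 
would be rejected while 'return a' would still pass, which is the 
distinction the single bit cannot express.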

'out' return values won't need to appear very often, but any time 
you want 'new' data or some kind of reference to a global, they 
will promote their return value to the status it needs. The main 
goal is to make 'ref' safe while granting flexibility to cases 
where it's hard to analyze safety just based on 'ref' alone.

The rules for returning a reference are simple, as befits a 
system that stores only one bit per expression.

ref int giveRef(ref int x)
{
   return x; // Pass

   int y;
   return y; // Fail, y is local
}

out int giveOut(ref int x)
{
   return x; // Fail, x has been marked 'scope'
}

With structs and classes, the member functions take hidden 
pointers to their data, so functions which return references must 
be assumed to be local unless proven otherwise. With an 'out' 
return value, this is proved easily when the function is compiled:

struct Example
{
   int x;
   static int total;

   // With the following function, the result is as
   // local as the instance it gets called with
   ref int get() { return x; }

   // The 'out' applies to getTotal's hidden parameter as well
   out int getTotal()
   {
     total += x;
     return total; // pass: static int is global storage

     // compile-time error, hidden parameter marked 'scope'
     return x;
   }
}

This doesn't come up much, because a non-static member function 
doesn't usually return global data, but it is possible to do with 
'out' returns, and it makes static verification extremely easy 
for the compiler.

The rules for deriving locality from an expression are pretty 
simple, and have nothing to do with 'ref', really. All that 'out' 
and 'scope' do is assign 'local' to the 'ref' parameters.

1) Any address is as local as the structure it is copied from.

void one()
{
   int x;
   int* y = &x; // y is local because x is

   y = new int; // y is global

   static int z;
   y = &z; // y is global

   // Test must be declared before use: declarations in function
   // scope are processed in order
   static struct Test {
     int _x;
     ref int get() { return _x; }
   }
   Test t;

   t.get; // reference return is as local as t is
}

2) Locals may not be escaped

ref int two(ref int refPam, scope int scopePam)
{
   int x;
   return x; // Fail
   return scopePam; // Fail
   return refPam; // Pass

   static int* y;
   y = &x; // Fail
   y = &scopePam; // Fail
   y = &refPam; // Pass unsafely, but solution is complicated
}

Flow analysis is required to keep track of a variable that 
switches between local and global across branching statements. 
I believe that tracking this perfectly would be too expensive 
for the compiler and is better left to analyzeD or some such 
tool.
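
To give a feel for what a cheap, imprecise alternative could look 
like, here is a sketch of a conservative merge applied where two 
branches meet. This is an assumption of mine, not part of the 
proposal: the name mergeBranches is hypothetical, and the rule 
simply treats a variable as local whenever either branch may have 
made it local.

```d
// Hypothetical sketch of a conservative merge; not from the proposal.
enum Locality { global, local }

// Where two branches meet, treat the variable as local if either
// branch may have left it pointing at local data.
Locality mergeBranches(Locality a, Locality b)
{
    return (a == Locality.local || b == Locality.local)
        ? Locality.local
        : Locality.global;
}
```

Such a merge can reject code that is actually safe, but it needs 
no more than the one bit per expression already being stored.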

