About ref used for performance reasons with struct

kinke noone at hotmail.com
Tue Feb 12 05:34:09 PST 2013


On Monday, 11 February 2013 at 14:54:48 UTC, kinke wrote:
> I'd propose a small change so that suited structs are passed 
> transparently byref only if the parameter is not mutable, e.g., 
> for a function foo(const BigStruct s). The compiler would 
> therefore not need to analyze the code flow in the callee to 
> determine if the parameter is modified and hence a copy is 
> needed.
>
> The compiler would nevertheless need to be quite smart though:
>
> ---
> struct MyBigStruct { double a, b, c; }
> double foo(const MyBigStruct s, double* x)
> {
>     *x = s.a;
>     return s.b;
> }
> // naive optimization by byref passing
> double foo_ref(const ref MyBigStruct s, double* x)
> {
>     *x = s.a;
>     return s.b;
> }
>
> MyBigStruct s = { 1, 2, 3 };
> double* x = &s.b;
> auto bla = foo(s, x); // returns 2; now s.b = 1
> s.b = 2;              // reset s.b to 2
> bla = foo_ref(s, x);  // returns 1! (the new s.b = 1)
> ---
>
> So the compiler would need to prove there is no way the 
> argument can be modified and handle tricky aliasing issues 
> accordingly.

Thinking about it some more, I'm fairly convinced the proposal in 
this thread is just too dangerous. Proving that an argument passed 
transparently byref cannot be modified is too complex imo; 
e.g., it could be modified by the callee through aliasing, as 
shown in the example above, or it could be modified by another 
thread while the callee is running, changing the callee's 
parameter underneath it!
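
To make the aliasing hazard concrete, here is a minimal runnable sketch of the quoted example. fooRef is just an illustrative name for the hand-written byref variant, and the asserts encode the results noted in the quoted comments:

```d
import std.stdio : writeln;

struct MyBigStruct { double a, b, c; }

// Byval: the callee works on a private copy of the argument.
double foo(const MyBigStruct s, double* x)
{
    *x = s.a;   // writes to the caller's variable, not to the copy
    return s.b; // still the value at the time of the call
}

// Byref: s aliases the caller's variable, so writing through x
// can change s even though s is declared const.
double fooRef(const ref MyBigStruct s, double* x)
{
    *x = s.a;
    return s.b; // observes the write above if x points at s.b
}

void main()
{
    auto s = MyBigStruct(1, 2, 3);
    double* x = &s.b;

    const byVal = foo(s, x);
    assert(byVal == 2 && s.b == 1);

    s.b = 2; // reset s.b to 2
    const byRef = fooRef(s, x);
    assert(byRef == 1 && s.b == 1);

    writeln(byVal, " ", byRef);
}
```

The two functions have identical bodies, so a compiler substituting one for the other would silently change the result whenever x points into s.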

I think the intended move semantics (byval passing for small 
structs w/o postblit constructor/destructor, otherwise byref) are 
only really safe if the argument is an rvalue: the rvalue is 
guaranteed not to be used after the call, so potential 
modifications are not visible to the caller, and it cannot be 
used simultaneously in another thread. 
Certain lvalue cases could be optimized as well, e.g., if the 
lvalue is a private variable of the caller (local variable or 
parameter) AND is not used after the call.

---
import std.parallelism : task;

struct MyBigStruct { double a, b, c; }

double bar(MyBigStruct s) { return s.a; }

double foo(MyBigStruct s) // s is an lvalue (parameter)
{
    return bar(s); // s NOT used afterwards => byref passing
}

void main(string[] args)
{
    MyBigStruct s; // s is an lvalue (local variable)
    auto t = task!foo(s);   // invoke foo(s) in another thread;
    t.executeInNewThread(); // s used afterwards => byval
    if (args.length > 1)    // stand-in for some runtime condition
    {
        foo(s);      // s used afterwards (next line) => byval
        s.a += 1;
    }
    else
        foo(s);      // s NOT used afterwards => byref
    t.yieldForce;    // wait for the other thread before returning
}
---
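
The rvalue/lvalue distinction above can be made observable with a postblit that counts copies. This is only a sketch: Tracked, make and consume are hypothetical names, and it relies on my understanding that D already moves rvalue struct arguments into byval parameters without running the postblit:

```d
struct Tracked
{
    double a, b, c;
    static int copies;       // number of postblit calls so far
    this(this) { ++copies; } // runs whenever a copy is made
}

Tracked make() { return Tracked(1, 2, 3); }

double consume(Tracked t) { return t.a; }

void main()
{
    // rvalue argument: moved into the parameter, no copy made
    consume(make());
    assert(Tracked.copies == 0);

    // lvalue argument: the caller could still use lv afterwards,
    // so the parameter has to be a real copy
    auto lv = make();
    consume(lv);
    assert(Tracked.copies == 1);
}
```

The lvalue optimization discussed above would amount to treating the second call like the first one whenever lv is provably dead after the call.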

So imo parameters need to be marked with something special like 
'auto ref' if a copy is to be elided for performance reasons, 
though not in its current form (templates only, and leading to 
code bloat). Instead, rvalues should simply be turned into 
lvalues before the call and then passed byref just like ordinary 
lvalues. And as I've already pointed out a few times in recent 
discussions, I'm not a fan of 'const auto ref' for non-mutable 
parameters, so I'd very much like to see 'const ref' for these 
(accepting rvalues too, just like C++).
The function signature would then clearly indicate that these 
params are references to the caller's arguments, with all the 
potentially dangerous implications.
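
For reference, a small sketch of the status quo this paragraph argues against. passedByRef and first are hypothetical names; the point is that 'auto ref' needs a template and yields one instantiation per ref-ness, while a plain 'const ref' parameter only takes lvalues, so the caller has to create the temporary by hand:

```d
struct MyBigStruct { double a, b, c; }

// 'auto ref' only works on templates: the compiler generates a
// byref instance for lvalues and a byval instance for rvalues.
bool passedByRef()(auto ref const MyBigStruct s)
{
    return __traits(isRef, s);
}

// Plain 'const ref': a single function, lvalue arguments only.
double first(const ref MyBigStruct s) { return s.a; }

void main()
{
    auto lv = MyBigStruct(1, 2, 3);
    assert(passedByRef(lv));                    // byref instance
    assert(!passedByRef(MyBigStruct(4, 5, 6))); // byval instance

    assert(first(lv) == 1);

    // What the compiler could do automatically for an rvalue:
    // turn it into an lvalue first, then pass that byref.
    auto tmp = MyBigStruct(4, 5, 6);
    assert(first(tmp) == 4);
}
```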

