auto ref escaping local variable

Tue Jan 24 18:32:02 PST 2017

On Tuesday, January 24, 2017 11:16:21 Ali Çehreli via Digitalmars-d wrote:
> On 01/24/2017 02:03 AM, Jonathan M Davis via Digitalmars-d wrote:
>  > On Tuesday, January 24, 2017 00:47:31 Ali Çehreli via Digitalmars-d

> Obviously, I know all of that and they are pretty complicated for new
> programmers.
>
> I just can't imagine what the semantics of a function could be. Do you
> have an example? So, we're talking about a function that will mutate its
> argument but the caller sometimes doesn't care. Oh, this sounds like
> functions from the C era, which take null when the caller does not care.
>
> So, is this the guideline? "Make the argument 'auto ref' when you have
> something to return in addition to the return value." If so, it's
> sub-optimal because the 'auto ref' doesn't have the opportunity of
> bypassing operations like the C function could:
>
>      if (arg) {
>          // Do expensive operation
>      }
>
> If I guessed the semantics right, non-const 'auto ref' does not have
> that luxury.

In general, I think that the guideline is to not bother with ref at all if
you're not explicitly trying to get a value back. If you have a struct
that's expensive enough to copy around that you need ref, then maybe it
shouldn't be a struct on the stack. And if you care about optimizing stuff
enough to use ref to avoid copies, then you should understand it well enough
to understand the consequences of using it.

In general, I think that the place for auto ref is when you're trying to
forward ref-ness like when you're wrapping a range, and you want the
ref-ness of the wrapped front to be passed on to the wrapper range _if_ it
returns by ref, but you don't want it to be ref if it's not ref, and you
don't want to do a bunch of static ifs to make it work for both.
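
As a rough sketch of that forwarding case (Wrapper and Counter are made-up
names for illustration, not anything from Phobos), auto ref on the return type
lets one definition cover both kinds of wrapped range:

```d
import std.range.primitives : empty, front, popFront;

// Hypothetical range whose front is computed, so it is returned by value.
struct Counter
{
    int current, limit;
    bool empty() { return current >= limit; }
    int front() { return current; }
    void popFront() { ++current; }
}

// Hypothetical wrapper range.
struct Wrapper(R)
{
    R source;

    bool empty() { return source.empty; }
    void popFront() { source.popFront(); }

    // Returns by ref if source.front is an lvalue (e.g. for int[]),
    // and by value otherwise (e.g. for Counter). No static if needed.
    auto ref front() { return source.front; }
}
```

With an int[] source, assigning to w.front() writes through to the array
element; with a Counter source, the same assignment does not compile because
front is an rvalue.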

And if you're looking to have a function that accepts both rvalues and
lvalues, auto ref makes sense so long as the function doesn't mutate its
arguments. Adding const is nice in that it then guarantees that it doesn't,
but it's so restrictive that it usually makes no sense for generic code (at
least not if it's dealing with arbitrary types as opposed to a specific
group of known types like all integer types). So, using auto ref in place of
const& in C++ makes sense so long as you're willing to be careful about not
mutating the argument, and const auto ref _can_ make sense, but it's
restrictive enough that it probably doesn't.
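
A minimal sketch of the const&-like use, assuming a hypothetical Matrix type
and hypothetical function names. Note that auto ref parameters only work on
templated functions, and that nothing here stops trace4 from mutating m; it
simply doesn't:

```d
// Hypothetical struct that is moderately expensive to copy.
struct Matrix { double[16] cells = 0.0; }

// Accepts lvalues (bound by ref, no copy) and rvalues (bound by value).
// It only reads from m, by convention rather than by compiler enforcement.
double trace4(M)(auto ref M m)
{
    double sum = 0.0;
    foreach (i; 0 .. 4)
        sum += m.cells[i * 5]; // diagonal of a 4x4 matrix stored row by row
    return sum;
}

// Reports how the argument was bound in this particular instantiation.
bool boundByRef(T)(auto ref T arg)
{
    return __traits(isRef, arg);
}
```

Calling boundByRef with a variable yields true, and calling it with a
temporary such as Matrix() yields false, since the compiler generates a
separate instantiation for each case.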

And as you indicated, using auto ref with a function that either might or
will mutate its argument is likely to be rarely useful. It makes sense when
you're looking to pass ref-ness along (which makes sense in some generic
code but most code isn't going to want to do that), and it makes sense if
you're paranoid about unnecessary copies and are willing to force the caller
to make a copy if they don't want their variable mutated when it's passed in
(since then copying only happens if the caller makes it happen), but that
puts an unusual and arguably error-prone burden on the caller. So, in
general, I would expect that auto ref would be used when either passing
along refness or when the programmer wanted an equivalent to const& and was
willing to risk mutation occurring by accident.
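
To make that burden on the caller concrete, here is a small sketch (Buffer and
takeFirst are hypothetical names) of an auto ref function that does mutate its
argument:

```d
// Hypothetical value type.
struct Buffer { int[4] values; }

// Returns the first element and zeroes it out. With an lvalue argument
// the caller's variable is modified; with an rvalue only a temporary is.
int takeFirst(T)(auto ref T buf)
{
    int first = buf.values[0];
    buf.values[0] = 0;
    return first;
}
```

A caller who wants to keep their Buffer intact has to write
"auto copy = b; takeFirst(copy);" themselves, so the copy only happens when
the caller asks for it, but forgetting to do so silently changes b.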

Skipping auto ref and manually overloading the function like you were
suggesting doesn't fix any of these complications though. It just makes them
more explicit (and thus possibly more clear to the programmer if they don't
understand auto ref enough), and it makes it so that the functions can be
virtual. Aside from when you need to pass on ref-ness, what you want in
principle is auto ref const, but const is just too restrictive to work in
the general case. So, I can't possibly recommend to anyone that they start
slapping const on function parameters by default, auto ref or not.
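
For comparison, a sketch of the manual overloads (Payload and Sink are
hypothetical names). Because these are not templates, they can be ordinary
virtual member functions of a class:

```d
// Hypothetical value type.
struct Payload { int value; }

// Hypothetical class with one overload per kind of argument.
class Sink
{
    int total;
    int byValueCalls;

    // Preferred for lvalues: no copy is made.
    void put(ref Payload p) { total += p.value; }

    // Picked for rvalues. Inside the body, p is a named variable and
    // therefore an lvalue, so put(p) forwards to the ref overload.
    void put(Payload p)
    {
        ++byValueCalls;
        put(p);
    }
}
```

The ref overload is deliberately non-const here: as discussed above, the
mutation question is exactly the same as with auto ref, just spelled out.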

But all of these complications are part of why I would simply recommend
_not_ using ref unless you specifically _want_ the argument to be mutated
and that's part of the function's API or if you know what you're doing and
know that you need to avoid the cost of the copy. And just don't have
structs that are expensive to copy. Phobos already tends to assume that -
especially for ranges. And it's so incredibly easy to accidentally copy an
object if you mess up with ref that relying on getting ref right doesn't
seem like a great solution in general. Also, D has move semantics built into
the language, making it so that expensive copying is not as big a problem in
D as it is in C++ (particularly C++98).
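
One way to see the built-in moves is to count postblit calls. This is only a
sketch, with Tracked, make, and consume as hypothetical names:

```d
// Hypothetical struct that counts how often it is copied.
struct Tracked
{
    int id;
    static int copies;
    this(this) { ++copies; } // postblit: runs on copy, not on move
}

// Returns a fresh value, i.e. an rvalue at the call site.
Tracked make(int id) { return Tracked(id); }

// Takes its argument by value.
int consume(Tracked t) { return t.id; }
```

Passing make(1) straight into consume, or initializing a variable from
make(2), leaves Tracked.copies untouched because the rvalue is moved. Only
passing a named variable to consume bumps the counter.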

So, I would start by just not using ref, and if profiling indicated that I
had a struct that was too expensive to be copying around, I would then look
at either putting it on the heap and avoiding the whole problem or using ref
and auto ref to avoid copies, but then I'm taking upon myself the burden of
making sure that I get ref right enough that I don't end up with unintended
copies too frequently.

> const is still engrained in my programming mind due to long exposure to
> C and C++. I guess D is proving that it's not that essential to be
> const-correct. This is similar to how private is not as strong and in
> some cases public is the default.

const is great in principle, but it is so restrictive in D as to be
borderline useless. If you're just dealing with built-in types, it works
reasonably well. But as soon as you have user-defined types and
indirections, then life gets disgusting fast. Postblit constructors don't
work with const. Ranges don't work with const. const tends to be viral in that
once something is const, you can't get anything non-const out of it, and
it's difficult to do anything like tail-const outside of arrays, which the
language understands well enough that it makes tail-const work for them.
Ref-counting doesn't work with const. If your container is const (or your
reference to the container is const), it's going to be really hard to get a
range over that container - even more so if you want the range to detect
when the container is mutated out from under it and protect you like an
iterator would in Java. The list of stuff that doesn't work with const just
piles up as your program becomes more complicated.
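
A small sketch of the range problem and of the tail-const exception for
arrays (Countdown and sumTail are hypothetical names):

```d
import std.range.primitives : empty, front, popFront;

// Hypothetical range. empty and front can be const, but popFront has to
// mutate the range, so a const Countdown cannot be iterated.
struct Countdown
{
    int n;
    bool empty() const { return n <= 0; }
    int front() const { return n; }
    void popFront() { --n; }
}

// const(int)[] is tail-const: the elements are const, but the slice
// itself is mutable and can be advanced with popFront.
int sumTail(const(int)[] xs)
{
    int total;
    for (; !xs.empty; xs.popFront())
        total += xs.front;
    return total;
}
```

A fully const int[] can be passed to sumTail because the language converts it
to const(int)[] on the way in. There is no equivalent conversion for a const
Countdown; here a mutable copy happens to work only because the struct has no
indirections.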

In theory, we might be able to fix some of these problems - like if we could
figure out how to get postblit constructors to work with const or if we could
figure out some way for a range to indicate how it could be converted to a
tail-const variant of itself - but once you have transitive const, it locks
everything down way more than occurs in C++, and I think that a number of
the problems with const are simply insurmountable as long as const has no
backdoors. You're basically getting immutable but without the benefits.
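
Transitivity itself can be shown in a few lines. This is a sketch with a
hypothetical Node type; the two enums just record whether the write
type-checks:

```d
// Hypothetical type with a single indirection.
struct Node { int* payload; }

// Writing through the pointer of a mutable Node type-checks.
enum writeThroughMutable = __traits(compiles, { Node n; *n.payload = 1; });

// With a const Node, the pointed-to int is const as well, so the same
// write is rejected at compile time.
enum writeThroughConst = __traits(compiles, { const Node n; *n.payload = 1; });
```

In C++, the equivalent const object would only make the pointer itself const,
and the write through it would still be allowed.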

So, I'm all for using const where it works, but I'm not at all in a hurry to
slap it on anything where the types aren't well-known, and it's the sort of
thing that I expect to have to be removed at some point if I start using it
on user-defined types. And if you're using ranges, then const pretty much
goes out the window right there. So, while I would love to be able to use
const more (in fact, the whole reason that I started off with D2 back in
2008 rather than D1 was because D2 had const, and D1 didn't), experience has
shown that D's const is simply too restrictive to be useful in the general
case. If you try very hard to use it, you can use it, but odds are that
you're simply not going to be able to use it on most code, and I'm almost to
the point that I simply wouldn't bother with it aside from making local
variables of built-in types const. I do still try and use it on member
functions where it's clear that it will work, but increasingly, I just don't
bother.

- Jonathan M Davis