DIP25/DIP1000: My thoughts round 2
Chris M.
chrismohrfeld at comcast.net
Sun Sep 2 05:14:58 UTC 2018
Round 2 because I had this whole thing typed up, and then my
power went out on me right before I posted. I was much happier
with how that one was worded too.
Basically I'd like to go over at length one of the issues I see
with these DIPs (though I think it applies more to DIP1000),
namely return parameters and what we could do to make them
stronger. I will say I do not have the chops to go implement
these ideas myself, even if I had approval and support. This is
more to get my thoughts out there and see what other people think
about them (frankly I'd be putting this in Study if it wasn't a
ghost town over there).
First I'm going to go over DIP25 again as I understand it for
background, stealing some examples from the DIP page. Let's
start with the following.
ref int id(ref int x) {
    return x; // pass-through function that does nothing
}
ref int fun() {
    int x;
    return id(x); // escape the address of local variable
}
The id() function just takes and returns a variable by ref, which
is perfectly legal. However it is open to abuse. As you see in
fun(), id() is used to escape a reference to a local variable,
which is obviously not desired behavior. The issue is how do we
tell fun(), from id()'s signature alone, "id() will return a
reference to whatever you pass it, one way or another. Make sure
you don't give id()'s return value to something that'll outlive
the argument you pass to id()" (though we need to say this in
more concise terms obviously). DIP25 solves this pretty nicely
with return parameters.
// now this function is banned, since it has a ref parameter and
// returns by ref
ref int wrongId(ref int x) {
    return x; // ERROR! Cannot return a ref, please use "return ref"
}
// this is fine however
ref int id(return ref int x) {
    return x;
}
ref int fun() {
    int x;
    static int y;
    // no, wait, since we're returning to a scope that'll outlive x,
    // this errors at compile-time. Thanks return ref
    return id(x);
    return id(y); // fine, sure, y lives forever
}
fun() now knows the return value of id() cannot outlive the
argument it passes to id(). This allows us to disallow certain
undesired behavior at compile-time, which is great.
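For contrast, here is a minimal sketch of id() being used in a way that should be accepted. The helper bump() is hypothetical (it is not from the DIP page); the point is only that the reference id() hands back is used and dropped while its argument is still alive, so nothing escapes.

```d
// Same id() as above: the result may not outlive the argument `x`.
ref int id(return ref int x) @safe
{
    return x;
}

// Hypothetical helper, for illustration only.
int bump() @safe
{
    int local = 41;
    id(local) += 1; // the reference is used and gone before `local` dies
    return local;   // returned by value, so no reference leaves bump()
}

void main() @safe
{
    assert(bump() == 42);
}
```

The difference from fun() is that bump() returns by value, so the caller never sees a reference into bump()'s stack frame.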
With that in mind, let's move on to DIP1000. Namely, I'm looking
at this issue Walter filed.
https://issues.dlang.org/show_bug.cgi?id=19097
I'll try to detail it here (and steal more examples, thanks Mike
:*) ). It has to do with the same principles I outlined above for
DIP25, only this time we're using pointers rather than refs.
First example, which works as expected:
int* frank(return scope int* p) { return p; } // basically id()
void main()
{
    // lifetimes end in reverse order from which they are declared
    int* p; // `p`'s lifetime is longer than `i`'s
    int i;  // `i`'s lifetime is longer than `q`'s
    int* q; // `q`'s lifetime is the shortest
    q = frank(&i); // ok because `i`'s lifetime is longer than `q`'s
    p = frank(&i); // error because `i`'s lifetime is shorter than `p`'s
}
frank() marks its parameter as return, to signal to main() that
wherever main() puts frank()'s return value, it can't outlive
what main() passed as an argument to frank(). All fine and dandy.
Second example (I'd pay closer attention to betty()'s definition
here)
void betty(ref scope int* r, return scope int* p)
{
    // (1) Error: scope variable `p` assigned to `r` with longer lifetime
    r = p;
}
void main()
{
    int* p;
    int i;
    int* q;
    betty(q, &i); // (2) ok
    betty(p, &i); // (3) should be error
}
Hang on, why can't I compile betty(), when it's doing the same
thing as frank(), only putting the return value in the first
parameter rather than returning it? No reason, I absolutely
should be able to compile and use betty(). So the question
becomes, how can betty() tell main(), that what main() passes as
the first argument to betty() can't outlive what's passed as the
second argument? Marking the second parameter return does not
work here, as that only ties its lifetime to the return value. It
can't be used on arbitrary parameters. How to resolve this?
Walter's solution is as follows. If a function returns void, and
its first parameter is ref, apply the "return" annotation to the
first parameter rather than the return value of the function.
Using these conditions, betty() now compiles, and main() errors
at (3) as expected. However I find this solution too restrictive.
While it fits many functions within Phobos, we are tying users to
this special case and forcing them to unnecessarily refactor
their code around it. What if I don't want it to be void and want
the function to return something as well? What if I want to
return via the second parameter? This just seems to be setting up
another trap for users to fall into.
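To make those two questions concrete, here is a rough sketch of the shapes I mean. The names bettyChecked() and bettySecond() are made up for illustration. The sketch is meant to build as plain @system code without the DIP1000 switch, where the annotations are not enforced, so it says nothing about what the checker accepts; it only shows that both are ordinary signatures, and that neither is "void with the output in the first ref parameter".

```d
// Hypothetical: writes through the first ref parameter, but also
// returns a status, so the function is not void.
bool bettyChecked(ref scope int* r, return scope int* p)
{
    if (p is null)
        return false;
    r = p;
    return true;
}

// Hypothetical: void, but the output is the second parameter.
void bettySecond(return scope int* p, ref scope int* r)
{
    r = p;
}

void main()
{
    int i = 7;
    int* q1;
    int* q2;
    assert(bettyChecked(q1, &i)); // q1 now points at i
    assert(*q1 == 7);
    bettySecond(&i, q2);          // q2 now points at i
    assert(q2 is &i);
}
```

In both cases the lifetime relationship is the same one betty() has: the output pointer must not outlive what p points to.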
I talked about this in the "Is @safe still a work-in-progress?"
thread, but I'll repeat it here again. There is a cleaner way to
do this. I'll demonstrate using some borrowed Rust syntax, but
remember the syntax matters much less here than the idea. Rather
than using "return", we instead annotate the parameters like so:
// okay it's not pretty
void betty(ref scope int*'a r, scope int*'a p)
{
    r = p; // cool, p's lifetime is tied to r's lifetime
}
void main()
{
    int* p;
    int i;
    int* q;
    betty(q, &i); // (2) ok
    betty(p, &i); // (3) error
}
Good, these are the results I expect. What if I want to output to
the second parameter?
void betty(scope int*'a r, ref scope int*'a p)
{
    p = r; // cool, p's lifetime is tied to r's lifetime
}
void main()
{
    int* p;
    int i;
    int* q;
    betty(&i, q); // (2) ok
    betty(&i, p); // (3) error
}
Nice, that'll work too.
Here's frank():
int*'a frank(scope int*'a p) { return p; } // basically id()
void main()
{
    // lifetimes end in reverse order from which they are declared
    int* p; // `p`'s lifetime is longer than `i`'s
    int i;  // `i`'s lifetime is longer than `q`'s
    int* q; // `q`'s lifetime is the shortest
    q = frank(&i); // ok because `i`'s lifetime is longer than `q`'s
    p = frank(&i); // error because `i`'s lifetime is shorter than `p`'s
}
These annotations are much more flexible since they can be moved
any which way around the function signature, and have the added
benefit of visually tying together lifetimes. For further
consistency it could also be extended back to DIP25:
ref'a int id(ref'a int x) {
    return x;
}
Hopefully that was coherent. Again, this is for me to get my
thoughts out there, but I'm also interested in what other people
think about this.