Borrowing and Ownership
Timon Gehr
timon.gehr at gmx.ch
Sun Oct 27 22:36:30 UTC 2019
I finally got around to writing up some thoughts on @safe borrowing and
ownership in D. I didn't spend nearly enough time on this post, so the
details of this proposal might not be optimal yet, and it is likely to
miss a few details. The TLDR is that `scope` pointers and built-in
references should behave like Rust borrowed pointers. (Except lifetimes
will be tracked through function calls and data structures a lot less
precisely, at least initially.) The meaning of `T*` should not change
from what it is today.
First, note that even though there is a lot of confusion around this,
`@safe` is currently not inherently broken. It provides memory safety
(modulo implementation bugs in the compiler). The problem we want to
solve is that @safe code does not support exposing direct references
into the guts of data structures that use memory management schemes
other than tracing GC. @trusted is currently broken, however (see
further below in this post).
Basic assumptions:
- We want to start with simple rules that ensure memory safety of
slightly more expressive @safe code instead of comprehensive ones that
ensure both safety and very high expressiveness. (I have more ambitious
ideas than what I discuss here, but I doubt those are realistic for D
right now.)
- With DIP 1021 accepted, `scope` is headed towards meaning controlled
lifetime without mutable aliasing. (`ref` implies `scope`.)
- Tracing GC is a successful way to write @safe programs and should
continue to be supported as an option.
In particular, @live is a dead end, because:
- It either provides no guarantees or it breaks memory safety of @safe code.
- It wants to change the meaning of `T*` based on a function attribute.
- It breaks D programs that want to use the GC.
The next steps should instead be roughly as follows:
Clarify the meaning of `T*` in impure `@safe` code:
- A non-`scope` built-in pointer in impure `@safe` code points to a
value with unbounded lifetime (e.g. a GC pointer or a pointer into the
data segment) and unrestricted aliasing. The same holds true for
non-`scope` class references. This is true today, but should be
explicitly stated in the language specification to prevent confusion.
- In @system code, `T*` is a pointer with arbitrary lifetime, and
@trusted code needs to ensure @safe code cannot access a `T*` whose
lifetime may be less than the last possible time that @safe code might
access the pointer.
Improve `@trusted`:
- The problem with `@trusted` is that it has no defense against `@safe`
code destroying its invariants or accessing raw pointers that are only
meant to be manipulated by `@trusted` code. There should therefore be a
way to mark data as `@trusted` (or equivalent), such that `@safe` code
cannot access it.
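The `@trusted` problem can be reproduced in today's D, because `private` only restricts access from other modules. The following is a minimal sketch, assuming a hypothetical `Buffer` type and `corrupt` function invented for illustration (neither is from any library):

```d
import core.stdc.stdlib : calloc, free;

// Hypothetical type, for illustration only.
struct Buffer {
    private int* ptr;   // raw pointer, meant to be touched only by @trusted code
    private size_t len; // invariant: ptr points to at least len ints

    this(size_t n) @trusted {
        ptr = cast(int*) calloc(n, int.sizeof);
        len = ptr is null ? 0 : n;
    }

    @disable this(this); // no copies, so the destructor frees exactly once

    ~this() @trusted { free(ptr); ptr = null; len = 0; }

    // Only memory safe as long as the invariant on len holds.
    ref int opIndex(size_t i) @trusted {
        if (i >= len) assert(0, "index out of bounds");
        return ptr[i];
    }
}

// Same module, so `private` does not stop this @safe function.
void corrupt(ref Buffer b) @safe {
    b.len = size_t.max; // accepted by the compiler; the bounds check is now useless
}

void main() @safe {
    auto b = Buffer(4);
    b[0] = 1;      // fine
    // corrupt(b); // after this call, b[1000] would read out of bounds
}
```

Marking `ptr` and `len` as `@trusted` data, as suggested above, would make the assignment in `corrupt` a compile-time error.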
Change the meaning of `scope`:
- `scope` should apply to all types of data equally, not only built-in
pointers and references. The most obvious use case for this is @safe
interfacing with a C library that exposes handles as structs with an
integer field but specifies undefined behavior if those handles are
mismanaged. Not everything that is a manually-managed reference to
something is a built-in pointer or reference.
- Non-immutable non-scope values may not be assigned to `scope` values.
In particular, non-`immutable` `scope` member functions cannot accept a
non-`scope` receiver. This is necessary, because otherwise you
immediately break the aliasing guarantee DIP 1021 aims to introduce.
- `scope` on a struct does not imply its fields are `scope`. (It is
perfectly fine to store a GC pointer within something with a scoped
lifetime.)
- Fields can be `scope`. `scope` fields cannot be accessed through a
non-`scope` receiver. The lifetime of `scope` fields ends when the
lifetime of the enclosing object ends.
- `scope` has to be a type constructor.
- A non-`scope` pointer cannot be dereferenced if that would yield a
`scope` value. (However, such a `scope` value can be moved somewhere
else through a non-scope pointer.)
Add borrowing rules:
- When copying a mutable `scope` value to another mutable `scope` value,
access to the original value has to be disabled until the copy's
lifetime ends.
- When copying a mutable `scope` value to a `const` `scope` value, the
original value has to become `const` until the copy's lifetime ends.
- When copying a `const` `scope` value to a `const` `scope` value, the
original value only has to outlive the copy.
- In particular, when taking the address of a value on the stack, the
resulting `scope`d pointer will restrict access to that variable
according to those rules until its lifetime ends. The `return`
annotation can be used to track such assignments through function calls.
- For stack values, data flow analysis can be used to detect values that
can be temporarily promoted to `scope`. Overloaded functions should
prefer the `scope` overload.
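Part of the stack-address rule can already be tried with today's compiler. The sketch below assumes the `-preview=dip1000` switch, which current compilers need in order to accept `&x` in `@safe` code; the commented-out lines mark where today's lifetime checking and the borrowing rules proposed above would differ. The variable names are made up for illustration.

```d
// Compile with: dmd -preview=dip1000

int* escaped; // a non-scope pointer with unbounded lifetime

void main() @safe {
    int x = 1;
    {
        scope int* p = &x; // address of a stack variable yields a scope pointer
        *p += 1;           // mutation through the borrow
        // escaped = p;    // rejected today: p must not outlive x
        // ++x;            // accepted today; under the proposed rules this
        //                 // would be an error, since x is mutably borrowed by p
    }
    ++x;                   // fine either way: p's lifetime has ended
    assert(x == 3);
}
```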
Example: Library implementation of Unique pointers with @safe borrowing
(`const`/`immutable`/`class` interactions left out for simplicity):
---
struct Unique(T){
    @trusted private scope T* payload;
    @disable this(this);
    auto borrow()@trusted return{ // (`return` refers to `ref this`)
        // potentially many references to unique pointer exist,
        // need runtime check
        // here, we'll just temporarily null out the Unique reference.
        static struct Borrowed{
            @trusted private scope Unique!T* self;
            @trusted private scope T* payload;
            @disable this(this);
            ~this()@trusted{ self.payload=payload; }
            return scope(T*) borrow()@trusted scope{
                return payload;
            }
            alias borrow this;
        }
        auto borrowed=payload;
        payload=null;
        return scope(Borrowed)(&this,borrowed);
    }
    scope(T*) borrow()@trusted scope return{
        // only one reference to unique pointer exists,
        // just return payload
        // note that while this does not actually return
        // a reference to `this`, we want the calling `@safe`
        // code to treat it as if it did, so that this can be
        // a `@trusted` function
        return payload;
    }
    ~this(){
        destroy((()@trusted=>payload)());
        ()@trusted{
            free(payload);
            payload=null;
        }(); // call the lambda; without `()` it would never run
    }
    alias borrow this; // enable implicit borrowing
}
Unique!T makeUnique(T,A...)(A args){
    auto p=malloc(...);
    ...;
    return Unique!T(p);
}
---
---
void main(){
    auto p=makeUnique!int(3);
    ++*p; // ok, p is temporarily promoted to `scope` and `++` is
          // evaluated on a borrowed p.
    {
        scope Unique!int* q=[p].ptr;
        ++*p; // error, p is borrowed by q
    }
    ++*p; // ok, q went out of scope
    Unique!int* q=[p].ptr; // ok
    ++*p; // ok
    // however, this line used the non-scope overload of `borrow` as
    // `p` can no longer be promoted to `scope`
    auto r=q; // ok
    ++**q; // ok
    static void foo(ref int x, Unique!int* y){
        assert((*y).borrow() is null); // reference disabled temporarily
        ++x; // ok
    }
    foo((*q).borrow(),r);
    foo((*r).borrow(),q);
}
---
Similar strategies work for manually-allocated arrays and reference
counting.
For @safe reference counting for mutable payloads, there always needs to
be a runtime check on borrow, similar to the first implementation of the
`borrow` function above. This could be implemented by reserving a bit in
the reference count for keeping track of such mutable borrows. To enable
both const and mutable borrows, one would probably need two reference
counts, one for normal references and one for const borrows. (Note that
Rust uses similar runtime checks for safe reference counting.)
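The runtime check for mutable borrows of a reference-counted payload could look roughly like the following sketch. It compiles with today's D and therefore uses a callback instead of the proposed `scope` return values; `Rc`, `Block`, `make` and `borrowMut` are hypothetical names, a separate `bool` stands in for the reserved bit, and `T` is assumed to be plain data without a destructor. It does not try to cover every case (for example, const borrows are left out).

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical reference-counted box, for illustration only.
struct Rc(T) {
    private static struct Block {
        T value;
        size_t refs;      // number of Rc handles sharing this block
        bool mutBorrowed; // set while a mutable borrow is active
    }
    private Block* block;

    static Rc make(T value) @trusted {
        auto b = cast(Block*) malloc(Block.sizeof);
        if (b is null) assert(0, "out of memory");
        *b = Block(value, 1, false);
        Rc r;
        r.block = b;
        return r;
    }

    this(this) @trusted { if (block) ++block.refs; }

    ~this() @trusted {
        if (block && --block.refs == 0) free(block);
        block = null;
    }

    // Runtime-checked mutable borrow: fails if another one is still active.
    void borrowMut(scope void delegate(ref T) @safe dg) @trusted {
        if (block.mutBorrowed) assert(0, "already mutably borrowed");
        block.mutBorrowed = true;
        scope(exit) block.mutBorrowed = false; // also runs if dg throws
        dg(block.value);
    }
}

void main() @safe {
    auto a = Rc!int.make(1);
    auto b = a;                        // second handle to the same payload
    a.borrowMut((ref int x) { ++x; }); // ok: no other borrow is active
    b.borrowMut((ref int x) {
        assert(x == 2);
        // a.borrowMut((ref int y) { ++y; }); // would trip the runtime check
    });
}
```

Because `a` and `b` share one block, the flag is visible through every handle, which is what makes the check work even when many references to the payload exist.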
The main drawback of this proposal is that it doesn't separate control
of lifetime from control of aliasing. Doing so would, however, require
adding another type qualifier, and it does not have precedent in Rust.