Proposal for design of 'scope' (Was: Re: Opportunities for D)
via Digitalmars-d
digitalmars-d at puremagic.com
Thu Jul 10 13:10:36 PDT 2014
I've been working on a proposal for ownership and borrowing for
some time, and I seem to have arrived at a result very similar to
yours. It is not really ready, because I keep discovering
weaknesses and can only work on it in my free time, but I'm glad
this topic is finally being addressed. I'll write about what I have now:
First of all, as you've already stated, scope needs to be a type
modifier (currently it's a storage class, I think). This has
consequences for the syntax of any parameters it takes, because
type modifiers need type constructors. This means the
`scope(...)` syntax is out. I suggest using template
instantiation syntax instead: `scope!(...)`, which can be freely
combined with the type constructor syntax:
`scope!lifetime(MyClass)`.
Explicit lifetimes are indeed necessary, but dedicated
identifiers for them are not. Instead, a lifetime can refer
directly to the symbol of the "owner". Example:
int[100] buffer;
scope!buffer(int[]) slice;
Instead of lifetime intersections with `&` (I believe Timon
proposed that in the original thread), simply specify multiple
"owners": `scope!(a, b)`. This works, because as far as I can see
there is no need for lifetime unions, only intersections.
A problem that has been discussed in a few places is safely
returning a slice or a reference to an input parameter. This can
be solved nicely:
scope!haystack(string) findSubstring(
    scope string haystack,
    scope string needle
);
Inside `findSubstring`, the compiler can make sure that no
references to `haystack` or `needle` can escape (an
unqualified `scope` can be used here, no need to specify an
"owner"), but it will allow returning a slice of `haystack`, because
the signature says: "The return value will not live longer than
the parameter `haystack`."
// fixed-size arrays (new syntax of Kenji's PR)
immutable(char)[$] text = "Old McDonald had a farm.";
auto sub = findSubstring(text, "had");
// typeof(sub) is scope!text(string),
// `haystack` gets substituted by `text`
assert(sub == "had a farm.");
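To make the aliasing concrete, here is a minimal sketch in present-day D, without any of the proposed annotations. The body of `findSubstring` is my assumption (the post only gives a signature); it returns the tail of the haystack starting at the first match, which is what the assertion above suggests. The point is that the result shares memory with `haystack`, which is exactly the relationship `scope!haystack` is meant to record in the type.

```d
import std.string : indexOf;

// Hypothetical body: returns the part of `haystack` starting at the first
// occurrence of `needle`, or null if there is no match.
string findSubstring(string haystack, string needle)
{
    immutable idx = haystack.indexOf(needle);
    return idx < 0 ? null : haystack[idx .. $];
}

void main()
{
    string text = "Old McDonald had a farm.";
    auto sub = findSubstring(text, "had");
    assert(sub == "had a farm.");
    // No copy was made: the result points into the haystack's memory,
    // so it must not outlive whatever owns that memory.
    assert(sub.ptr == text.ptr + 13);
}
```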
Have multiple parameters? No problem:
scope!(a,b)(string) selectOneAtRandom(
    scope string a,
    scope string b
);
// => a _and_ b will outlive return value
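A small sketch in present-day D shows why both owners have to be named. The function body is an assumption for illustration (the post only declares the signature), and no proposed syntax is used. Which argument comes back is only known at run time, so a caller has to treat the result as borrowed from both.

```d
import std.random : uniform;

// Hypothetical body for the signature above, in plain D.
string selectOneAtRandom(string a, string b)
{
    return uniform(0, 2) == 0 ? a : b;
}

void main()
{
    string a = "first", b = "second";
    auto r = selectOneAtRandom(a, b);
    // The result aliases one of the two arguments, but the type system
    // cannot know which; hence the intersection of both lifetimes.
    assert(r.ptr is a.ptr || r.ptr is b.ptr);
}
```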
For methods, `scope!this` can be used, too. It's really no
different from other parameters, as `this` is just a special
implicit parameter.
There is also a nice extension: `scope!(const owner)`. This
means that as long as a value designated as such lives, `owner`
will be treated as const.
An interesting application is the old `byLine` problem, where the
function keeps an internal buffer which is reused for every line
that is read, but a slice into it is returned. When a user
naively stores these slices in an array, she will find that all
of them have the same content, because they point to the same
buffer. See how this is avoided with `scope!(const ...)`:
struct ByLineImpl(Char, Terminator) {
private:
    Char[] line;
    // ...
public:
    // - return value must not outlive `this` (i.e. the range)
    // - as long as the return value exists, `this` will be const
    @property scope!(const this)(Char[]) front() const {
        return line;
    }
    void popFront() { // not `const`, of course
        // ...
    }
    // ...
}
void main() {
    alias Line = const(char)[];
    auto byline = stdin.byLine();
    foreach(line; byline) {
        write(line); // OK, `write` takes its parameters as scope
                     // (assuming the widespread usage of scope
                     // throughout Phobos)
    }
    Line[] lines;
    foreach(line; byline) {
        lines ~= line;
        // ERROR: `line` has type scope!(const byline)(Line), not
        // Line
    }
    // let's try to work around it:
    scope!(const byline)(Line)[] clines;
    foreach(line; byline) { // ERROR: `byline` is const
        clines ~= line;
    }
    // => nope, won't work
    // another example, to show how it works:
    auto tmp = byline.front; // OK
    // `byline` is const as long as `tmp` exists
    write(byline.front); // OK, `front` is const
    byline.popFront(); // ERROR: `byline` is const
}
To describe what happens here: As long as any variable (or
temporary) with the type `scope!(const byline)` exists, `byline`
itself will be treated as const. "Exists" in this case only
refers to lexical scope: A variable is said to "exist" from the
point it is declared to the end of the scope it's declared in.
Loops, gotos, and exceptions don't have an effect. This means
that it can be easily checked by the compiler, without it having
to perform complicated control flow analysis.
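For readers who haven't run into the `byLine` problem, it can be reproduced in present-day D with a small stand-in range. `ReusingLines` is a hypothetical type invented for this sketch, not Phobos code; it reuses one heap buffer for every line, the way the post describes `byLine` doing. Today the naive version compiles and silently misbehaves; under the proposal it would be rejected instead.

```d
// Hypothetical miniature of byLine: one buffer is reused for every line.
struct ReusingLines
{
    private string[] pending; // lines not yet "read"
    private char[] buffer;    // storage shared by all returned slices
    private size_t len;       // length of the current line
    private bool done;

    this(string[] lines)
    {
        pending = lines;
        buffer = new char[](64);
        popFront(); // load the first line
    }

    @property bool empty() const { return done; }
    @property const(char)[] front() const { return buffer[0 .. len]; }

    void popFront()
    {
        if (pending.length == 0) { done = true; return; }
        string next = pending[0];
        pending = pending[1 .. $];
        assert(next.length <= buffer.length);
        buffer[0 .. next.length] = next[]; // overwrites the previous line
        len = next.length;
    }
}

void main()
{
    const(char)[][] kept;
    foreach (line; ReusingLines(["aaa", "bbb", "ccc"]))
        kept ~= line; // stores an alias of the shared buffer
    foreach (k; kept)
        assert(k == "ccc"); // every stored slice shows the last line

    const(char)[][] copied;
    foreach (line; ReusingLines(["aaa", "bbb", "ccc"]))
        copied ~= line.dup; // copying is the manual fix today
    assert(copied[0] == "aaa" && copied[1] == "bbb" && copied[2] == "ccc");
}
```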
I also thought about allowing `scope!return` for functions, to
specify that a value will not outlive the value returned from
the function, but I'm not sure whether there is an actual use
case, and the semantics are not clear.
An open question is whether there needs to be an explicit
designation of GC'd values (for example by `scope!static` or
`scope!GC`), to say that a given value lives as long as it's
needed (or "forever").
Specifying an owner in the type also integrates naturally with
allocators. Assuming an allocator releases all of its memory to
the operating system when it is destroyed, there needs to be a
guarantee that none of its contents are referenced anymore at this
point. This can be achieved by returning a borrowed reference:
struct MyAllocator {
    scope!this(T) alloc(T)() if(is(T == class)) {
        // ...
    }
}
Note that this does not preclude the allocator from doing garbage
collection while it exists; in this manner, `scope!GC` might just
be an application of this pattern instead of a special syntax.
Now, for the problems:
Obviously, there is quite a bit of complexity involved. I can
imagine that inferring the scope for templates (which is
essential, just as for const and the other type modifiers) can be
complicated.
On the upside, at least it requires no control or data flow
analysis. It's also a purely additive change: If implemented
right, no currently working code will break.
Then I encountered the following problem, and there are several
different variations of it:
struct S {
    int* p;
    void releaseBuffer() scope {
        // `scope` in the signature applies to `this`
        free(this.p);
        this.p = null;
    }
}
int bar(scope ref S a, scope int* b) {
    a.releaseBuffer();
    return *b; // use after free
}
S s;
bar(s, s.p);
The root cause of the problem here is the call to `free()`. I
_believe_ the solution is that `free()` (and equivalent functions
of allocators as well as `delete`) must not accept scope
parameters. More realistic candidates for such situations are
destructors in combination with move semantics. Therefore,
`~this()` needs to be marked as scope, too, for it to be callable
on a borrowed object. If a scope object has a non-scope
destructor, but no scope one, and is going to be destroyed, this
needs to be a compile error. (Rust avoids that problem by making
any object const while there are borrowed references, but this
requires its complex borrow checker, which we should avoid for D.)
I also have a few ideas about owned types and move semantics, but
this is mostly independent from borrowing (although, of course,
it integrates nicely with it). So, that's it, for now. Sorry for
the long text. Thoughts?