Proposal for design of 'scope' (Was: Re: Opportunities for D)

Fri Jul 11 14:02:27 PDT 2014

On Thu, Jul 10, 2014 at 08:10:36PM +0000, via Digitalmars-d wrote:
> I've been working on a proposal for ownership and borrowing for some
> time, and I seem to have come to a very similar result as you have. It
> is not really ready, because I keep discovering weaknesses, and can
> only work on it in my free time, but I'm glad this topic is finally
> addressed. I'll write about what I have now:
> 
> First of all, as you've already stated, scope needs to be a type
> modifier (currently it's a storage class, I think). This has
> consequences for the syntax of any parameters it takes, because for
> type modifiers there need to be type constructors. This means, the
> `scope(...)` syntax is out. I suggest to use template instantiation
> syntax instead: `scope!(...)`, which can be freely combined with the
> type constructor syntax: `scope!lifetime(MyClass)`.
> 
> Explicit lifetimes are indeed necessary, but dedicated identifiers for
> them are not. Instead, a lifetime can refer directly to the symbol of
> the "owner". Example:
> 
>     int[100] buffer;
>     scope!buffer(int[]) slice;

Hmm. Seems that you're addressing a somewhat wider scope than what I had
in mind. I was thinking mainly of 'scope' as "does not escape the body
of this block", but you're talking about a more general case of being
able to specify explicit lifetimes.

[...]
> A problem that has been discussed in a few places is safely returning
> a slice or a reference to an input parameter. This can be solved
> nicely:
> 
>     scope!haystack(string) findSubstring(
>         scope string haystack,
>         scope string needle
>     );
> 
> Inside `findSubstring`, the compiler can make sure that no references
> to `haystack` or `needle` can escape (an unqualified `scope` can be
> used here, no need to specify an "owner"), but it will allow returning
> a slice from it, because the signature says: "The return value will
> not live longer than the parameter `haystack`."

This does seem to be quite a compelling argument for explicit scopes. It
does make it more complex to implement, though.
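
For concreteness, here is a minimal sketch of such a function in today's D, without the proposed annotation. The body (built on `std.string.indexOf`) is my own assumption; the proposal above only gives the signature. The point is that the returned slice aliases `haystack`'s memory, and nothing in the current signature says so:

```d
import std.string : indexOf;

// Returns the part of `haystack` that matches `needle`, or null if
// there is no match. The result is a slice of `haystack`, not a copy.
string findSubstring(string haystack, string needle)
{
    immutable pos = haystack.indexOf(needle);
    return pos < 0 ? null : haystack[pos .. pos + needle.length];
}

unittest
{
    string text = "hello world";
    auto hit = findSubstring(text, "world");
    assert(hit == "world");
    // The result points into `text` itself.
    assert(hit.ptr is text.ptr + 6);
    assert(findSubstring(text, "xyz") is null);
}
```

With `scope!haystack(string)` as the return type, the caller would presumably be told that `hit` must not outlive `text`.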

[...]
> An interesting application is the old `byLine` problem, where the
> function keeps an internal buffer which is reused for every line that
> is read, but a slice into it is returned. When a user naively stores
> these slices in an array, she will find that all of them have the same
> content, because they point to the same buffer. See how this is
> avoided with `scope!(const ...)`:

This seems to be something else now. I'll have to think about this a bit
more, but my preliminary thought is that this adds yet another level of
complexity to 'scope', which is not necessarily a bad thing, but we
might want to start out with something simpler first.
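
Just to make sure we're talking about the same pitfall, here is a minimal sketch of it. `LineReader` is a hypothetical stand-in (not a Phobos type) that reuses a single buffer the way `byLine` is described above; the fixed 64-byte buffer and the helper names are assumptions made for illustration only:

```d
// Hypothetical reader that mimics byLine's buffer reuse: every call to
// next() overwrites the same storage.
struct LineReader
{
    private string[] lines;   // pending input, standing in for a file
    private char[] buffer;    // single buffer shared by all returned slices

    this(string[] lines)
    {
        this.lines = lines;
        this.buffer = new char[](64); // assumed large enough for any line
    }

    bool empty() const { return lines.length == 0; }

    // Copies the next line into the shared buffer and returns a slice of it.
    char[] next()
    {
        auto line = lines[0];
        lines = lines[1 .. $];
        buffer[0 .. line.length] = line[];
        return buffer[0 .. line.length];
    }
}

// Naive: stores the returned slices directly.
char[][] collectNaive(LineReader r)
{
    char[][] kept;
    while (!r.empty)
        kept ~= r.next();
    return kept;
}

// Safe: copies each line before the buffer is overwritten.
char[][] collectCopied(LineReader r)
{
    char[][] kept;
    while (!r.empty)
        kept ~= r.next().dup;
    return kept;
}

unittest
{
    auto naive = collectNaive(LineReader(["aaa", "bbb"]));
    assert(naive[0] == "bbb" && naive[1] == "bbb"); // both alias the buffer

    auto copied = collectCopied(LineReader(["aaa", "bbb"]));
    assert(copied[0] == "aaa" && copied[1] == "bbb");
}
```

As I understand the proposal, the idea is that the naive version would be rejected at compile time rather than silently storing aliases.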

[...]
> An open question is whether there needs to be an explicit designation
> of GC'd values (for example by `scope!static` or `scope!GC`), to say
> that a given value lives as long as it's needed (or "forever").

Shouldn't unqualified values already serve this purpose?

[...]
> Now, for the problems:
> 
> Obviously, there is quite a bit of complexity involved. I can imagine
> that inferring the scope for templates (which is essential, just as
> for const and the other type modifiers) can be complicated.

I'm thinking of aiming for a design where the compiler can infer all
lifetimes automatically, so that the user never has to write them. I'm
not sure if this is possible, but based on what Walter said, it would be
best if we infer as much as possible, since users are lazy and are
unlikely to be
thrilled at the idea of having to write additional annotations on their
types.

My original proposal was aimed at this, which is why I didn't put in
explicit lifetimes. I was hoping to find a way to define things such
that the lifetime is unambiguous from the context in which 'scope' is
used, so that users don't ever have to write anything more than that.
This also makes the compiler's life easier, since we don't have to keep
track of who owns what, and can just compute the lifetime from the
surrounding context. This may require sacrificing some precision in
lifetimes, but if it helps simplify things while still giving adequate
functionality, I think it's a good compromise.

[...]
> I also have a few ideas about owned types and move semantics, but this
> is mostly independent from borrowing (although, of course, it
> integrates nicely with it). So, that's it, for now. Sorry for the long
> text. Thoughts?

It seems that you're addressing the full borrowed reference/pointer
problem, which is necessary. But I was thinking more in terms of the
baseline
functionality -- what is the simplest design for 'scope' that still
gives useful semantics that cover most of the cases? I know there are
some tricky corner cases, but I'm wondering if we can somehow find an
easy solution for the easy parts (presumably the more common parts),
while still allowing for a way to deal with the hard parts.

At least for now, I'm thinking in the direction of finding something
with simple semantics that, at the same time, produces complex
(interesting) effects when composed, that we can use to solve the
borrowed pointer problem.

T

-- 
Computers are like a jungle: they have monitor lizards, rams, mice, c-moss, binary trees... and bugs.