Proposal for scoped const contracts
Steven Schveighoffer
schveiguy at yahoo.com
Mon Mar 24 09:58:02 PDT 2008
This idea has come from the discussion on the const debacle thread.
It is basically an idea for scoped const. The main goal is so that one can
specify that a function does not modify an argument without affecting the
constness of the input.
The main problem to solve would be that I have a function with an argument
that returns a subset of the argument. The easiest function to help explain
the problem is strchr. Please please do NOT tell me that my design is
fundamentally unsound because you can return a range or pair, and then slice
the original arg based on that pair. There are other examples that cannot
be solved this way, this is just the easiest to explain with. Everyone who
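uses C should know about strchr (strictly, the string-pattern form below is
C's strstr; strchr takes a single character, but the const issue is the same):
char *strchr(char const *source, char const *pattern);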
The result of strchr is meant to be a pointer into source where pattern
exists.
Note that this is not even close to const-correct in C, because if you pass
in a const source, the const is inherently cast away.
So let's move to the D version, which I'll specify with const to begin with:
const(char)[] strchr(const(char)[] source, const(char)[] pattern);
Note that const(char)[] MUST be the return value, because otherwise we
cannot return a slice into source. So far so good, but now, if I am using
strchr to search for a pattern in a mutable string, and then I want to
MODIFY the original string, I must cast away const, because the return value
is const. OK, so you might say let's add an overload (or templatize
strchr):
char[] strchr(char[] source, const(char)[] pattern);
This compiles and works, but the signature can no longer state that
source will not be modified. Therefore, the compiler is not able to take
advantage of optimizations, and the caller has no guarantee that the source
array will be untouched.
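To make the dilemma concrete, here is a minimal sketch of the const version and the cast it forces on a caller who owns a mutable string. The name findConst and its body are hypothetical, chosen only for illustration; the search itself is a naive loop, not a claim about how any library implements it.

```d
// Hypothetical const-correct search: promises not to modify source,
// but therefore has to hand back a const slice.
const(char)[] findConst(const(char)[] source, const(char)[] pattern)
{
    if (pattern.length == 0)
        return source;
    for (size_t i = 0; i + pattern.length <= source.length; ++i)
        if (source[i .. i + pattern.length] == pattern)
            return source[i .. $]; // slice into source, still const
    return null;
}

// Build with: dmd -unittest -main -run file.d
unittest
{
    char[] text = "hello world".dup;
    const(char)[] hit = findConst(text, "world");

    // The caller owns a mutable string, yet cannot write through the result.
    static assert(!__traits(compiles, { hit[0] = 'W'; }));

    // The only way forward is to cast away const by hand.
    char[] writable = cast(char[]) hit;
    writable[0] = 'W';
    assert(text == "hello World");
}
```

The cast is legitimate here only because the caller happens to know the data was mutable; nothing in the signature lets the compiler check that.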
So, how do we specify this? I propose that a keyword be used to specify "scoped
const", which basically means, "this variable is const within this function,
but reverts to its original const-ness when returned". Let's call it foo
(as a generic name for now):
foo(char)[] strchr(foo(char)[] source, const(char)[] pattern);
Note that foo only specifies source and not pattern because we are not
returning anything from pattern, so it can be fully const.
What does this mean? foo(char)[] source is not modifiable within strchr,
but is implicitly castable to the type of the argument at the call site. So
if we call strchr with a char[], foo(char)[] is essentially an alias to
const(char)[] while inside strchr, but upon return is implicitly castable
back to char[]. This does not violate any const contracts because the
argument was mutable to begin with. If we call strchr with a const(char)[],
foo(char)[] cannot be implicitly cast to char[] because the call site
version was not mutable, and implicitly removing const would violate const
rules. These rules can easily be checked by the compiler at the call site,
and so the function source does not need to be available.
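As a rough sketch of the call-site behavior described above: later D2 compilers have an inout qualifier that behaves much like foo. Using it here as a stand-in is an assumption made for illustration; inout is not part of this proposal's text, and the function name find is hypothetical.

```d
// inout stands in for the proposed foo: source is read-only inside the
// function, and the result takes on the caller's constness.
inout(char)[] find(inout(char)[] source, const(char)[] pattern)
{
    if (pattern.length == 0)
        return source;
    for (size_t i = 0; i + pattern.length <= source.length; ++i)
        if (source[i .. i + pattern.length] == pattern)
            return source[i .. $];
    return null;
}

// Build with: dmd -unittest -main -run file.d
unittest
{
    // Mutable argument: the result converts back to char[] with no cast.
    char[] m = "hello world".dup;
    char[] r = find(m, "world");
    r[0] = 'W';
    assert(m == "hello World");

    // Const argument: the result stays const, and the compiler rejects
    // any attempt to treat it as mutable.
    const(char)[] c = "hello world";
    const(char)[] rc = find(c, "world");
    assert(rc == "world");
    static assert(!__traits(compiles, { char[] bad = find(c, "world"); }));
}
```

Both calls go through the same compiled function; only the type check at the call site differs.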
So why must we have a keyword specification? Because of the expressive
nature of const types, you must be able to match exactly where the const
comes into play. For example, const(char)* is different than const(char*),
and so foo must be just as expressive. And in addition, the type returned
may not be exactly the parameter passed in, but the const-ness must be
upheld.
For example, what if the argument was a class, and the return type was
unrelated:
foo(membertype) getMember(foo(classtype) ct) { return ct.member; }
Note that if member is a function, it must also be foo, or else the contract
could be violated.
You should be able to declare intermediate variables of type foo(x):
foo(membertype) m = ct.member;
What if there are multiple arguments, and the result may come from any of
them:
foo(T) min(T)(foo(T) val1, foo(T) val2);
What if one calls min with a mutable and an invariant argument? The answer is
that foo should map to the least common denominator. If all of the foo's
are identical (invariant, const, or mutable), then the resulting foo would
be identical. If any of them differ, the resulting foo must be const to
uphold const-correctness.
In any case, val1 and val2 are const for the body of the function.
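The "least common denominator" rule can be sketched with the inout qualifier of later D2 compilers, again used purely as a stand-in for foo (an assumption, not part of this proposal). The function shorter is hypothetical; it uses array slices rather than a templated min so that the constness of the result is visible in its type.

```d
// Returns whichever slice is shorter; either argument may be the result,
// so both carry the stand-in qualifier.
inout(int)[] shorter(inout(int)[] a, inout(int)[] b)
{
    return a.length <= b.length ? a : b;
}

// Build with: dmd -unittest -main -run file.d
unittest
{
    int[] m1 = [1, 2];
    int[] m2 = [3, 4, 5];
    immutable(int)[] i1 = [6];

    // All arguments mutable: result is mutable.
    static assert(is(typeof(shorter(m1, m2)) == int[]));
    // All arguments immutable (invariant): result is immutable.
    static assert(is(typeof(shorter(i1, i1)) == immutable(int)[]));
    // Mixed constness: result falls back to const.
    static assert(is(typeof(shorter(m1, i1)) == const(int)[]));

    // With mutable arguments the caller may write through the result.
    shorter(m1, m2)[0] = 10;
    assert(m1[0] == 10);
}
```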
There are other benefits. For example, to implement a min function that
allows a mutable return for mutable arguments, you must define min as a
template, which can generate up to 6 variations (for all the different
argument const types), but with the foo notation, the function generated is
always identical. The only check for const-correctness is at the call site.
Note that this idea is very similar to Janice's idea of:
K(T) f(const K, T)(K(T) t);
The differences are:
- This idea does not require a different template instantiation for
identical code, and in fact is not a template, so it does not require source
or generate bloat.
- This idea ensures that the argument remains const inside the function
even if the argument at the call site is mutable. It enforces the contract
that the caller is making that the argument will never be modified inside
the function.
------------------- PROPOSAL FOR KEYWORD -------------------
That is my general proposal for scoped const, and as an orthogonal
suggestion, which should by no means take away from my above proposal, I
suggest we use the argument keywords 'in' and 'out' to specify foo:
out(char)[] strchr(in(char)[] source, const(char)[] pattern);
So arguments are implicitly castable to 'in', no matter if they are mutable,
const, or invariant.
'in' types are implicitly castable to 'out' types.
'in' arguments cannot be modified inside the function (i.e. they are
essentially const, but with the additional specification that they can be
cast to 'out').
'out' is an alias for the constness at the call site defined by the
following rules:
- if all of the 'in' parameters are of one constancy, (i.e. all are
mutable, all are invariant, or all are const), then out is defined to be the
same constancy.
- if there are two different constancy values for 'in', then 'out' is
defined to be const.
These type declarations are made at the call site, not inside the function.
The function is compiled the same for all versions of 'in' and 'out'.
And for functions that are members of a class:
in out(T) func() {...} // essentially, in(this)
or
out(T) func() in {...}
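For the member-function case, the trailing-qualifier form has a close analogue in later D2 compilers, where inout may qualify this. The sketch below uses that as a stand-in (an assumption; the proposal's own 'in'/'out' spelling is not valid syntax in those compilers), with a hypothetical class Node.

```d
class Node
{
    int[] data;

    // The trailing inout applies to this, like the proposed
    // "out(T) func() in": the method cannot modify the object, and the
    // result carries the constness of the object it was called on.
    inout(int)[] items() inout
    {
        return data;
    }
}

// Build with: dmd -unittest -main -run file.d
unittest
{
    auto n = new Node;
    n.data = [1, 2, 3];

    // Mutable object: mutable result.
    int[] a = n.items();
    a[0] = 9;
    assert(n.data[0] == 9);

    // Const reference to the same object: const result only.
    const Node cn = n;
    static assert(is(typeof(cn.items()) == const(int)[]));
    static assert(!__traits(compiles, { int[] bad = cn.items(); }));
}
```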
Rationale: I think in and out are pretty much defunct keywords in this
context (out replaced by ref, in replaced by const), and so are fair game
for this syntax. They are also very good English descriptions of what I am
trying to do.
-Steve