Generic const - a non-functional view
Steven Schveighoffer
schveiguy at yahoo.com
Tue Jun 24 10:30:53 PDT 2008
Before I begin, I want to say that a lot of this has come from reading other
ideas that other people have had. I know for instance that Janice has
proposed something along these same lines using templates. I think this
proposal has not been posted quite exactly the same by anyone else, and the
underlying themes are different than other proposals I or others have had
regarding const and logical const. In any case, credit should go to all
those ideas posted that led me to this proposal.
D2's const system is centered around functional programming. Without it,
the grand view of Walter and Andrei to have a compiler that optimizes for
multi-core systems with little effort from developers cannot be realized.
However, there is another reason to have const, which Walter at least
acknowledges. That reason is for specifying contracts for functions. That
goal is partially realized, but since the const system is more focused on
FP, it does not completely provide ways to specify all possible functionally
sound constructs. This post is aimed at outlining a way that D2's const
system could be extended to provide const specification for both pure
functions and function contracts, in as easy to use fashion as possible.
We begin by thinking about defining const. Const is an attribute applied to
a type. The attribute doesn't change the data that is described by the
type, it only changes the type, and rules for accessing the data. We have
four rules for const:
1. A variable that is typed as const or invariant cannot be changed.
2. A variable that is typed as const or invariant that has references to
data cannot change that data through the references. (const is transitive)
3. A variable that is typed as invariant is guaranteed that all data that
can be referenced through the variable will not change by any other
references to the same data.
4. One can implicitly cast a data type to the same data type with different
constancy if the newly modified type provides at least all the same
protections to the same data that the original type provided.
In the current D2 scheme, we have the following two rules which implement
rule 4:
A. mutable or invariant data can implicitly be casted to const.
B. No other implicit casts are possible.
But let's think about how constancy is applied to an aggregate data type.
The current granularity of applying const is to an entire aggregate. One
cannot selectively apply const to pieces of an aggregate. However, if one
could do so, one does not violate any of the rules. Here is an example,
keep in mind that X and X' are meant to be the same data type but with a
different constancy applied:
class X
{
int m;
int m2;
}
class X'
{
int m;
const int m2;
}
Since X and X' are the same data type, but with different constancy, one can
implicitly cast a variable of type X to a variable of type X' without
violating rules 1-4.
So the question now becomes, how does one go about specifying a syntax for
providing this functionality such that one has as much expressiveness as
possible without violating the rules, and without making the syntax or
specification cumbersome?
First, we specify a way to tag members of an aggregate such that they can be
referred to in groups. This one piece of grammar is the only crufty part of
the solution, but could be easily alleviated by introducing a new keyword.
In any case, using existing keywords, you create a 'tag group', which is
scoped like any other symbol. Using existing syntax, this looks like:
const alias A;
If this appears in a module or class, or even function, it only exists
within that scope, just like a variable symbol. How does one use such a
tag?
class X
{
int m;
A int m2;
}
Now, m2 is in the tag group identified as A. Now we need a syntax in order
to affect just the constancy of the A tag group of X. If we think of
casting only the A group to const as parameterizing const's scope, it's
similar to a parameter to a template. So let's use the same syntax:
const!(A)(X) x;
So const!(A) means 'Apply const only to the group A', and X is the type to
apply this rule to. So now, x.m is mutable, x.m2 is const. Now, we can
cast a plain X to a const!(A)(X) implicitly, because this does not break any
rules 1-4 above. However, we cannot implicitly cast a const!(A)(X) back to
an X because it would break rule 4.
Let us also define a special compiler-builtin tag group called __const_ALL.
This tag group is special in that all members of any aggregate, regardless
of the tag that is applied to them, also are tagged with __const_ALL. This
special group is used when applying const or invariant without
parameterization, this group is implied. i.e., const(X) is equivalent to
const!(__const_ALL)(X), and invariant(X) is equivalent to
invariant!(__const_ALL)(X). Using class X above, a const!(__const_ALL)(X)
has all const members, even though m2 is tagged with A.
This allows us to implement a logical const scheme, and also allow data to
be transitively invariant in order to be used in pure functions. We get the
best of all worlds.
However, there are some more problems to solve.
First, it should be explicitly stated that a parameterized const is just as
powerful as an unparameterized const with regards to pointers and arrays.
i.e., you can parameterize only what a pointer points to, or only the
elements of an array:
const!(A)(X) *x;
const!(A)(X)[] x2;
x and x2 are mutable, but the data pointed to is typed as const!(A)(X).
It is very unlikely that tag groups will have short names like 'A', they
will be something more descriptive. Utilizing these tag group names
everywhere would be very cumbersome:
const alias logic_const;
const!(logic_const)(X) x2 = x;
So we allow aliases for these parameterized type modifiers:
const alias logic_const;
alias const!(logic_const) lconst;
lconst(X) x2 = x;
This syntax is more readable, easier to use and understand.
The next problem is being able to tag members as NOT being affected by a
parameterized const statement. For example, C++, through the mutable
keyword, allows you to define members that are not casted to const when the
aggregate is casted to const. This is like a 'reverse' tag group. Without
something similar, one must tag all the members that SHOULD be casted to
const when logical const is applied. So we borrow the bitwise not operator
for this purpose:
const alias mutable;
alias const!(~mutable) lconst;
class X
{
mutable int m;
int m2;
}
auto x = new X;
lconst(X) lx = x; // ok, not weakening constancy
lx.m = 2; // ok
lx.m2 = 3; // error, m2 was not tagged by mutable, so was cast to const
Next, we try to reduce the code for parameterizing const on multiple
parameters. For example, if you have two tag groups A and B in an
aggregate, and you want to cast both of them to const, it would be:
const!(A)(const!(B)(X)) x;
Let's allow this kind of multiple specification to be shorthanded as:
const!(A, B)(X);
Next we tackle the chaining problem. Let's say you have a linked list, and
you want to specify a sorting function for that linked list. The sorting
function will not change any values pointed to by that linked list, but can
change the order of the nodes in the linked list. How does one specify this
contract? Using the rules we have so far:
const alias linkdata;
class Link(T)
{
Link!(T) next;
linkdata T data;
}
If we pass the head of the link list as const!(linkdata)(Link!(T)) to the
function, what happens? Upon traversing to the next link, the function is
able to modify the data in the next link. We need a way to specify a
recursive linkdata constancy. In order to do this, we need a second tag
group, and recursive definition:
const alias linkdata;
const alias link;
class Link(T)
{
link Link!(T) next;
linkdata T data;
}
alias const!(linkdata)(dataconst(link)) dataconst;
Note that we use dataconst inside the definition for dataconst, just like we
could have a pointer to a type within the definition of the type. Note also
that we need to define this as an alias so we can name the parameterized
const definition in order to use it within the definition. Therefore, you
cannot specify this type of constancy without an alias statement.
dataconst(Link!(T)) x;
Now, x.next is of type dataconst(Link!(T)), and x.data is of type const(T).
The one problem we cannot solve is head-const. For example, if you wanted
to allow someone to traverse a link list and not change the structure of the
links, but allow them to change the data, it is not possible without
breaking rule 2, as you cannot have a constant pointer that points to
mutable data. However, this kind of thing could be solved with generic
tagging (not just for const) combined with function overloading, or by using
interfaces that enforce these requirements.
There is one final bonus problem that can be solved using these ideas. If
you think of the data that a class reference points to being of type
<classname>_data, then you can think of a class reference as a pointer:
class X {};
alias X_data * X; // implicitly defined
Let's define another builtin tag group called __class_data_ref; We can now
redefine a class reference to X to be:
alias __class_data_ref(X_data) * X;
Now, we can define an alias to make x1 tail-const:
alias const!(__class_data_ref) tconst;
tconst(X) x1;
Now, x1 is rebindable, but the data x1 points to is const.
So in conclusion, I think this extension has the following benefits:
1. The existing ideas for pure functions requiring invariant are not broken
2. Logical const is possible for people who want const-contract
specification
3. Logical const can be recursive, allowing for contract specifications that
are not available in any other language AFAIK.
4. The syntax is powerful enough to allow specification of any const
casting, but easy enough to be usable and readable.
I understand that this proposal might be difficult to implement in the
compiler, and it might just be not worth the effort. There are a lot of
problems that can be solved by not using const or using interfaces, instead
of these constructs. However, this proposal represents what I think is the
most lenient and flexible proposal that is also const-correct from those I
have read, and allows for almost any const contract specification (excluding
head-const). Whether it gets implemented or not, it was interesting to
think about how to solve all these problems, and I enjoyed the challenge.
I would really like to hear people's opinions, especially Walter's.
-Steve
More information about the Digitalmars-d
mailing list