Generic const - a non-functional view

Steven Schveighoffer schveiguy at yahoo.com
Tue Jun 24 10:30:53 PDT 2008


Before I begin, I want to say that a lot of this has come from reading other 
ideas that other people have had.  I know for instance that Janice has 
proposed something along these same lines using templates.  I think this 
proposal has not been posted quite exactly the same by anyone else, and the 
underlying themes are different than other proposals I or others have had 
regarding const and logical const.  In any case, credit should go to all 
those ideas posted that led me to this proposal.

D2's const system is centered around functional programming.  Without it, 
the grand view of Walter and Andrei to have a compiler that optimizes for 
multi-core systems with little effort from developers cannot be realized. 
However, there is another reason to have const, which Walter at least 
acknowledges.  That reason is for specifying contracts for functions.  That 
goal is partially realized, but since the const system is more focused on 
FP, it does not completely provide ways to specify all possible functionally 
sound constructs.  This post is aimed at outlining a way that D2's const 
system could be extended to provide const specification for both pure 
functions and function contracts, in as easy to use fashion as possible.

We begin by thinking about defining const.  Const is an attribute applied to 
a type.  The attribute doesn't change the data that is described by the 
type, it only changes the type, and rules for accessing the data.  We have 
four rules for const:

1. A variable that is typed as const or invariant cannot be changed.

2. A variable that is typed as const or invariant that has references to 
data cannot change that data through the references. (const is transitive)

3. A variable that is typed as invariant is guaranteed that all data that 
can be referenced through the variable will not change by any other 
references to the same data.

4. One can implicitly cast a data type to the same data type with different 
constancy if the newly modified type provides at least all the same 
protections to the same data that the original type provided.

In the current D2 scheme, we have the following two rules which implement 
rule 4:
A. mutable or invariant data can implicitly be casted to const.
B. No other implicit casts are possible.

But let's think about how constancy is applied to an aggregate data type. 
The current granularity of applying const is to an entire aggregate.  One 
cannot selectively apply const to pieces of an aggregate.  However, if one 
could do so, one does not violate any of the rules.  Here is an example, 
keep in mind that X and X' are meant to be the same data type but with a 
different constancy applied:

class X
{
   int m;
   int m2;
}

class X'
{
   int m;
   const int m2;
}

Since X and X' are the same data type, but with different constancy, one can 
implicitly cast a variable of type X to a variable of type X' without 
violating rules 1-4.

So the question now becomes, how does one go about specifying a syntax for 
providing this functionality such that one has as much expressiveness as 
possible without violating the rules, and without making the syntax or 
specification cumbersome?

First, we specify a way to tag members of an aggregate such that they can be 
referred to in groups.  This one piece of grammar is the only crufty part of 
the solution, but could be easily alleviated by introducing a new keyword. 
In any case, using existing keywords, you create a 'tag group', which is 
scoped like any other symbol.  Using existing syntax, this looks like:

const alias A;

If this appears in a module or class, or even function, it only exists 
within that scope, just like a variable symbol.  How does one use such a 
tag?

class X
{
   int m;
   A int m2;
}

Now, m2 is in the tag group identified as A.  Now we need a syntax in order 
to affect just the constancy of the A tag group of X.  If we think of 
casting only the A group to const as parameterizing const's scope, it's 
similar to a parameter to a template.  So let's use the same syntax:

const!(A)(X) x;

So const!(A) means 'Apply const only to the group A', and X is the type to 
apply this rule to.  So now, x.m is mutable, x.m2 is const.  Now, we can 
cast a plain X to a const!(A)(X) implicitly, because this does not break any 
rules 1-4 above.  However, we cannot implicitly cast a const!(A)(X) back to 
an X because it would break rule 4.

Let us also define a special compiler-builtin tag group called __const_ALL. 
This tag group is special in that all members of any aggregate, regardless 
of the tag that is applied to them, also are tagged with __const_ALL.  This 
special group is used when applying const or invariant without 
parameterization, this group is implied.  i.e., const(X) is equivalent to 
const!(__const_ALL)(X), and invariant(X) is equivalent to 
invariant!(__const_ALL)(X).  Using class X above, a const!(__const_ALL)(X) 
has all const members, even though m2 is tagged with A.

This allows us to implement a logical const scheme, and also allow data to 
be transitively invariant in order to be used in pure functions.  We get the 
best of all worlds.

However, there are some more problems to solve.

First, it should be explicitly stated that a parameterized const is just as 
powerful as an unparameterized const with regards to pointers and arrays. 
i.e., you can parameterize only what a pointer points to, or only the 
elements of an array:

const!(A)(X) *x;
const!(A)(X)[] x2;

x and x2 are mutable, but the data pointed to is typed as const!(A)(X).

It is very unlikely that tag groups will have short names like 'A', they 
will be something more descriptive.  Utilizing these tag group names 
everywhere would be very cumbersome:

const alias logic_const;

const!(logic_const)(X) x2 = x;

So we allow aliases for these parameterized type modifiers:

const alias logic_const;

alias const!(logic_const) lconst;

lconst(X) x2 = x;

This syntax is more readable, easier to use and understand.

The next problem is being able to tag members as NOT being affected by a 
parameterized const statement.  For example, C++, through the mutable 
keyword, allows you to define members that are not casted to const when the 
aggregate is casted to const.  This is like a 'reverse' tag group.  Without 
something similar, one must tag all the members that SHOULD be casted to 
const when logical const is applied.  So we borrow the bitwise not operator 
for this purpose:

const alias mutable;

alias const!(~mutable) lconst;

class X
{
   mutable int m;
   int m2;
}

auto x = new X;
lconst(X) lx = x; // ok, not weakening constancy
lx.m = 2; // ok
lx.m2 = 3; // error, m2 was not tagged by mutable, so was cast to const

Next, we try to reduce the code for parameterizing const on multiple 
parameters.  For example, if you have two tag groups A and B in an 
aggregate, and you want to cast both of them to const, it would be:

const!(A)(const!(B)(X)) x;

Let's allow this kind of multiple specification to be shorthanded as:

const!(A, B)(X);

Next we tackle the chaining problem.  Let's say you have a linked list, and 
you want to specify a sorting function for that linked list.  The sorting 
function will not change any values pointed to by that linked list, but can 
change the order of the nodes in the linked list.  How does one specify this 
contract?  Using the rules we have so far:

const alias linkdata;

class Link(T)
{
    Link!(T) next;
    linkdata T data;
}

If we pass the head of the link list as const!(linkdata)(Link!(T)) to the 
function, what happens?  Upon traversing to the next link, the function is 
able to modify the data in the next link.  We need a way to specify a 
recursive linkdata constancy.  In order to do this, we need a second tag 
group, and recursive definition:

const alias linkdata;
const alias link;

class Link(T)
{
   link Link!(T) next;
   linkdata T data;
}

alias const!(linkdata)(dataconst(link)) dataconst;

Note that we use dataconst inside the definition for dataconst, just like we 
could have a pointer to a type within the definition of the type.  Note also 
that we need to define this as an alias so we can name the parameterized 
const definition in order to use it within the definition.  Therefore, you 
cannot specify this type of constancy without an alias statement.

dataconst(Link!(T)) x;

Now, x.next is of type dataconst(Link!(T)), and x.data is of type const(T).

The one problem we cannot solve is head-const.  For example, if you wanted 
to allow someone to traverse a link list and not change the structure of the 
links, but allow them to change the data, it is not possible without 
breaking rule 2, as you cannot have a constant pointer that points to 
mutable data.  However, this kind of thing could be solved with generic 
tagging (not just for const) combined with function overloading, or by using 
interfaces that enforce these requirements.

There is one final bonus problem that can be solved using these ideas.  If 
you think of the data that a class reference points to being of type 
<classname>_data, then you can think of a class reference as a pointer:

class X {};

alias X_data * X; // implicitly defined

Let's define another builtin tag group called __class_data_ref;  We can now 
redefine a class reference to X to be:

alias __class_data_ref(X_data) * X;

Now, we can define an alias to make x1 tail-const:

alias const!(__class_data_ref) tconst;

tconst(X) x1;

Now, x1 is rebindable, but the data x1 points to is const.

So in conclusion, I think this extension has the following benefits:

1. The existing ideas for pure functions requiring invariant are not broken
2. Logical const is possible for people who want const-contract 
specification
3. Logical const can be recursive, allowing for contract specifications that 
are not available in any other language AFAIK.
4. The syntax is powerful enough to allow specification of any const 
casting, but easy enough to be usable and readable.

I understand that this proposal might be difficult to implement in the 
compiler, and it might just be not worth the effort.  There are a lot of 
problems that can be solved by not using const or using interfaces, instead 
of these constructs.  However, this proposal represents what I think is the 
most lenient and flexible proposal that is also const-correct from those I 
have read, and allows for almost any const contract specification (excluding 
head-const).  Whether it gets implemented or not, it was interesting to 
think about how to solve all these problems, and I enjoyed the challenge.

I would really like to hear people's opinions, especially Walter's.

-Steve 





More information about the Digitalmars-d mailing list