[Issue 1961] New: Allow "scoped const" contracts

Mon Mar 31 10:22:22 PDT 2008

http://d.puremagic.com/issues/show_bug.cgi?id=1961

           Summary: Allow "scoped const" contracts
           Product: D
           Version: 2.013
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla at digitalmars.com
        ReportedBy: schveiguy at yahoo.com

The current const scheme does not allow scoped const.  A scoped const contract
means a function will not change an input parameter or anything referred to
through that input parameter, but the function does not restrict the contract
that the calling function has with the parameter.

First I will describe the problem I am trying to solve, then describe my
solution.

Two examples come to mind, C's strstr function, and C++'s min function, written
in D form without const here:

char[] strstr(char[] source, char[] pattern);

T min(T)(T a, T b);

First strstr.  Source is an input type, and should not be changed in the
function.  But the return value is a substring of source, and so the constancy
must be consistent.  In the current const scheme, this would be implemented as:

const(char)[] strstr(const(char)[] source, const(char)[] pattern)

Now, however, strstr cannot be used in an iterative pattern on a mutable or
invariant string:

char[] x = "hello".dup;

x = strstr(x, "l"); // error

So essentially, strstr is modifying the contract that the calling function has
with x, namely that once strstr is called on x, any result must be const.  This
sort of contract modification is not necessary in the const scheme, as one is
perfectly allowed to change x.

With an invariant version, you cannot re-assign to the argument also:

auto y = "hello";
y = strstr(y, "l"); // error

Again, we're modifying the contract.

If we solve the problem using templates, then we have:

T[] strstr(T)(T[] source, const(T)[] pattern);

Note, however, that this does not achieve what we want.  We are not
guaranteeing that strstr does not modify source in the case of a mutable
argument passed into the function.  There could be a static-if clause in the
function that allows strstr to compile and modify source if it is mutable.  A
user of the function cannot rely on strstr not changing source by looking at
the method signature, he must inspect the implementation.

Now let us look at the second example, min:

T min(T)(T a, T b);

Like the template version of strstr, this does not guarantee the constancy of
it's arguments if T is mutable.

If we add const such that min does not modify it's arguments it becomes:

const(T) min(T)(const(T) a, const(T) b);

And now, min is no longer as useful, because it cannot be used in the most
common case:

t = min(t, maxvalue); // clamp t to be <= maxvalue

if t is mutable or invariant.

What we need is a way to say that a parameter is constant only for the scope of
the function, and then reverts to its original constancy when returning.

Here is my proposal (with help from Janice) on how to solve this problem:

Let the 'inout' keyword describe this contract.  The inout keyword is
completely obsoleted by ref, and so can be used for this purpose.  Used like
const, the inout keyword used on a parameter means that that parameter is
essentially const  for the scope of the function.  When used on the return
value, it means that the return value is casted back to the constancy factor of
the inout parameters at the call-site.  The constancy factor of the parameters
follows the following rules:

  - If all inout parameters are homogeneous, meaning they are all const, all
invariant, all mutable, or all inout (think recursive functions), then the
constancy factor is equivalent to the constancy of the parameters.
  - If any inout parameters differ in constancy, then the constancy factor is
equivalent to const.

The following rules are in force during the function scope:

 - inout(T) can be implicitly cast to const(T)
 - inout(T) is transitively inout, meaning access to any member of T is also
inout.
 - The only thing implicitly castable to inout(T) is T.init (think return
null).

At the call-site, the compiler makes the decision as to what the constancy
factor is for the call (see rules above).  It then follows the rules of const
as if the constancy factor was substituted for inout.

Example, strstr:

inout(char)[] strstr(inout(char)[] source, const(char)[] pattern)
{
   /*
    * this would be an error, as const(char)[] is not implicitly castable to 
    * inout(char)[]
    */
   // return pattern;

   for(int i = 0; i + pattern.length <= source.length; i++)
   {
      inout(char)[] tmp = source[i..pattern.length]; // ok
      if(tmp == pattern) // ok, tmp implicitly casts to const(char)[]
         return source[i..$]; // implicitly casts back to call-site source
   }
   return source[$..$]; // to be consistent with strstr.
}

void f()
{
   auto a = "hello";
   a = strstr(a, "llo"); // cf (constancy factor) == invariant
   // char[] b = strstr(a, "llo"); // error, cannot cast invariant to mutable
   char[] b = "hello".dup;
   b = strstr(b, "llo"); // cf == mutable (note that "llo" doesn't play a role
         // because that parameter is not inout)
   const(char)[] c = strstr(b, "llo"); // cf = mutable, ok because mutable
         // implicitly casts to const
   c = strstr(a, "llo"); // cf = invariant, ok invariant casts to const
}

Now, let's see how min works:

inout(T) min(T)(inout(T) a, inout(T) b)
{
  return a < b ? a : b;
}

void f()
{
   invariant(char)[] i = "hello";
   const(char)[] c = "there";
   char[] m = "Walter".dup;

   //i = min(i, c); // error, since i and c vary in constancy, the result
           // is const, and you cannot implicitly cast const to invariant.

   c = min(i, c); // ok, cf == const, because not homogeneous
   c = min(m, c); // ok, cf == const
   c = min(m, i); // ok, cf == const
   i = min(i, "blah"); // ok, cf == invariant, homogeneous
   //m = min(m, c); // error, cf == const because not homogeneous.  
   //m = min(m, "blah"); // error, cf == const
   m = min(m, "blah".dup); // ok
}

The implementation in the compiler of this produces one version of the function
for all permutations of constancy.  This makes this solution superior over the
template solution (in addition to enforcing const for all versions) in that
there is only one function versus 3^x, where x is the number of inout
parameters.

Optionally, the compiler may generate multiple functions if certain versions of
constancy can be built optimized.  for example, if all the inout parameters are
invariant, the resulting function may be optimized differently.

Also optionally, the overload resolution could allow specializations.  For
example, if invariant allows you to avoid locking mutexes, then you may want to
write a special version of the method where everything is invariant.  In this
context, the overload resolution could match against the inout version last.

Note that the choice of 'inout' as a keyword should not degrade from the
semantic meaning of this enhancement.  If 'inout' is no good as a keyword, then
something else is fine with me.

--