Auto-invariance (WAS: const )

Tue Apr 1 09:06:27 PDT 2008

== Quote from Sean Kelly (sean at invisibleduck.org)'s article
> == Quote from Benji Smith (benji at benjismith.net)'s article
> > Sean Kelly wrote:
> > > I think the need for const vs. invariant in D 2.0 is that the compiler isn't
> > > smart enough to know when the referenced data should actually never
> > > change, and because a bunch of interesting optimizations are available
> > > when the data really doesn't ever change, we have a keyword to hint to
> > > the compiler that this is true.
> > If the compiler can't actually detect mutation of const values, then how
> > will it enforce const correctness? Is the 'const' keyword just a fancy
> > compiler-enforced form of documentation?
> The compiler can locally, but what if it doesn't have access to the entire
> program source when compiling a module?  For example, let's say I create
> a library function like so:
>     void doSomething( const(char)[] buf, void delegate() dg )
>     {
>         writefln( buf ); // A
>         dg();
>         writefln( buf ); // B
>     }
> If the compiler knows that the data referenced by buf will not change for
> the entire length of the call to doSomething() then it can cache information
> about buf in registers, etc.  However, doSomething() contains an opaque
> call to external code, which could in theory modify the contents of buf.
> Consider:
>     char[] inp = "hello world";
>     doSomething( inp, () { inp[0] = 'j'; } );
> So because there is an opaque call inside doSomething(), the compiler must
> assume that the contents of buf may change across that call, and therefore
> may only assume that buf is invariant from the beginning of doSomething()
> up to the call to dg(), then from the call of dg() to the end of doSomething().
> What I don't like about 'invariant' is that while the label is in the correct or
> at least the easiest place for the compiler, its location means code duplication
> for the programmer if she wants to have her function work for both constant
> and non-constant data.  This is fine for passing shared data between threads:
> class MyClass
> {
>     private string m_s;
>     void setString( string s ) { m_s = s; }
>     void setString( char[] s ) { m_s = s.idup; }
> }
> as it avoids the need to perform a copy, or more generally, to acquire a mutex.
> However, I think the far more common case will be that library programmers
> will want to create algorithms that work on any kind of string and have the
> compiler simply optimize the code ideally for each situation:
> size_t find(T)( in T[] buf, T pat )
> {
>     foreach( i, e; buf )
>         if( e == pat )
>             return i;
>     return buf.length;
> }
> Here, the need to create separate, identical implementations for the same
> algorithm that vary only by the constancy of buf simply so the compiler can
> optimize differently for each aspect is horrible.  I would much rather have
> the compiler invisibly generate the different permutations for me and do
> something fancy with the name mangling to sort out all out invisibly.  So:
> "abc".find( 'b' ); // calls the "invariant(char)[]" permutation
> char[] buf; buf.find( 'b' ); // calls the "const(char)[]" permutation
> Of course, this means that the compiler will silently be generating more
> code but so what?  A decent linker could throw out the unused routines
> anyway.  I haven't given this idea much thought, but on the surface it seems
> like it could do away with the need for 'invariant' almost entirely.
> Sean

No one liked this idea, huh? :-)  One thought for reducing the number of
permutations was to only permute on "in" parameters.  Explicitly specified
"const" or "invariant" parameters are left as-is.  Another possible reduction
would be to only do this for template functions, though this would be
somewhat of an odd special case.

Sean