Inherited const when you need to mutate

Tue Jul 10 12:00:59 PDT 2012

On Tue, Jul 10, 2012 at 02:04:04PM -0400, Jonathan M Davis wrote:
> On Tuesday, July 10, 2012 10:13:57 H. S. Teoh wrote:
[...]
> > Y'know, this brings up an interesting question. Do methods like
> > toString _need_ to be const? That is, _physical_ const? Or are we
> > unconsciously conflating physical const with logical const here?
> > 
> > Yes, certain runtime operations need to be able to work with const
> > methods, but I wonder if those required const methods really belong
> > to a core set of more primitive operations that guarantee physical
> > const, and perhaps shouldn't be conflated with logical operations
> > like "convert this object to a string representation", which _may_
> > require caching, etc.?
> 
> For a member function to be called on a const object, that function
> must be const. Whether it's logical const or physical const is
> irrelevant as far as that goes. As such, opEquals, opCmp, toHash, and
> toString all need to be const on Object, or it will be impossible for
> const Objects to work properly. 

Yes, they have to be const, but by doing so, we are implicitly forcing
physical constness on all of them (because that's the only const D
knows). The question is whether this is the way we should go.
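
To make the rule under discussion concrete, here is a minimal sketch. Widget, peek, and bump are hypothetical names invented for illustration; nothing here comes from druntime.

```d
// Hypothetical class, used only to illustrate the rule quoted above.
class Widget {
    private int value = 42;

    // const method: callable through mutable, const, and immutable references.
    int peek() const { return value; }

    // Non-const method: callable only through a mutable reference.
    int bump() { return ++value; }
}

void main() {
    auto m = new Widget;
    const Widget c = m;   // mutable converts implicitly to const

    assert(c.peek() == 42);
    // The compiler rejects a non-const call on a const reference.
    static assert(!__traits(compiles, c.bump()));

    assert(m.bump() == 43);
    assert(c.peek() == 43);   // the const view still sees the mutation
}
```

Note that const here only promises that nothing is mutated *through that reference*, and that promise covers every field of the object.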

> Ideally, we'd also have a way to make it possible to have objects
> which aren't const and can't be const use those functions (which given
> physical constness obviously requires a separate function - be it an
> overload or an entirely separate function), but without those
> functions being const, const objects don't work.

Which is why I suggested in another post to have both const and
non-const variants of the methods. But that isn't a good solution
either, because what if some objects simply can't have const versions of
those methods? Plus, it leads to needless code duplication -- even if
you implement a non-const toString method, you still need to also
implement a const toString method because people will expect to be able
to call the const method.
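
A rough sketch of the duplication I mean, assuming a made-up class Name with a method full() standing in for toString (all of these names are hypothetical):

```d
class Name {
    private string first, last;
    private string cachedFull;   // filled in lazily by the mutable overload
    int computations;            // counts how often the cached string is built

    this(string first, string last) {
        this.first = first;
        this.last = last;
    }

    // Mutable overload: free to cache.
    string full() {
        if (cachedFull is null) {
            cachedFull = first ~ " " ~ last;
            ++computations;
        }
        return cachedFull;
    }

    // const overload: the same logic written a second time, minus the cache.
    string full() const {
        return first ~ " " ~ last;
    }
}

void main() {
    auto n = new Name("Ada", "Lovelace");
    assert(n.full() == "Ada Lovelace");
    assert(n.full() == "Ada Lovelace");
    assert(n.computations == 1);         // second call hit the cache

    const Name c = n;
    assert(c.full() == "Ada Lovelace");  // resolves to the const overload
    assert(n.computations == 1);         // which never touches the cache
}
```

Both bodies have to be kept in sync by hand, and the const one gets none of the benefit of the cache.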

> Of greater debate is whether opEquals, opCmp, toString, and toHash on
> structs need to be const. Aside from druntime functions wanting to be
> able to take their arguments as const, I don't really see that as being
> necessary (though Walter wants to require that they all be @safe const
> pure nothrow regardless of whether they're classes or structs), and
> since druntime probably has to templatize the functions which would
> take const anyway (and it can use inout or templatize them anyway if
> it doesn't need to), I wouldn't expect that much in druntime would
> require const. It should work with it, but it shouldn't need it. So, I
> don't think that structs really need to have those functions be const.
> 
> Classes is where it's a big problem - because of inheritance.
[...]

I think hidden somewhere in this is an unconscious conflation of
physical const with logical const.

Take toHash, for example. Why does it need to be const? The naïve
assumption would be, well, we're taking the hash of some field values,
and that shouldn't require changing anything, so yeah, it's a const
method. However, that doesn't fully cover all the use cases of toHash.
For one thing, hashes _don't_ need to be based on every single field in
the struct/object. I can easily decide, in my custom toHash function, to
only compute the hash value based on two out of five fields in my struct
(perhaps only those two fields matter for whatever I'm using the hash
value for). So I don't care what the values of the other fields are. In
particular, if the hash value is expensive to compute, I want to be able
to cache the computed value in one of the other fields.

So here's a hidden assumption, that toHash must be const -- it must be
logical const, yes, but that is in no way equivalent to physical const.
In this case, I can't use toHash at all, because it doesn't permit
caching, even though its computed value is based only on the unchanged
fields. Or, to take this point further, what I _really_ mean is that if
my struct is:

	struct S {
		string x,y;	// hash computed on these values
		hash_t cache;
		int p,q;	// not used by toHash
	}

then my toHash method really is expecting this struct:

	struct logical_const_S {
		const(string) x,y;
		hash_t cache;
		int p,q;	// these can be const or not, we don't care
	}

AFAIK, D currently doesn't allow implicit conversion from S to
logical_const_S. If it did, and if there were a simple way to express
this in the method signature of toHash, then I bet a lot of the
complaints about const in druntime would go away, because then we'd have
a way of doing caching or whatever it is people feel is indispensable,
*without* breaking D's const system.
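
For comparison, here is a sketch of what the language does accept for a struct like S. The method names hashConst and hashCaching, the helper combine, and the extra cached flag are assumptions made for illustration, and the djb2-style loop merely stands in for an expensive computation.

```d
// djb2-style string hash; a stand-in for an expensive computation.
size_t combine(const(char)[] a, const(char)[] b) {
    size_t h = 5381;
    foreach (char c; a) h = h * 33 + c;
    foreach (char c; b) h = h * 33 + c;
    return h;
}

struct S {
    string x, y;    // hash computed on these values
    size_t cache;   // hash_t in the post; druntime aliases it to size_t
    bool cached;
    int p, q;       // not used by the hash

    // const: every field is frozen, including the bookkeeping ones.
    size_t hashConst() const {
        static assert( __traits(compiles, x.length + y.length)); // reads are fine
        static assert(!__traits(compiles, cache = 1));           // writes are not
        return combine(x, y);
    }

    // Non-const: may cache, but is unavailable on a const(S).
    size_t hashCaching() {
        if (!cached) {
            cache = combine(x, y);
            cached = true;
        }
        return cache;
    }
}

void main() {
    auto s = S("foo", "bar");
    assert(s.hashCaching() == s.hashConst());

    const S cs = s;
    assert(cs.hashConst() == s.hashConst());
    static assert(!__traits(compiles, cs.hashCaching()));
}
```

The const method can read x and y but cannot touch cache, which is exactly the all-or-nothing behaviour that logical_const_S would relax.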

Now to bring this to my other point: the conversion S -> logical_const_S
would allow, to some limited extent, a non-leaky object API (by leaky I
mean an API that breaks encapsulation). Currently, if I declare toHash as a
const method, it means that I guarantee the object won't mutate in that
method, not even mutation that *still retains the same logical value*.
But the user of my class doesn't -- and shouldn't -- care about that. As
long as the public methods of the class do not exhibit any visible
change, then I should have the freedom to mutate whatever I like inside
a logical const method.

For example, say I have this base class:

	class B {
		private string x, y;
		hash_t toHash() {
			// compute hash value based on x and y
		}
		string xGetter() const { return x; }
		string yGetter() const { return y; }
	}

Now say I have a derived class:

	class D : B {
		bool cached = false;
		hash_t hash_cache;

		override hash_t toHash() {
			if (!cached) {
				hash_cache = /* expensive computation */;
				cached = true;
			}
			// return the cached value on both the first and later calls
			return hash_cache;
		}
	}
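
A compilable variant of the two classes above, as a sketch only: the classes are renamed Base and Cached, the method is called hashValue so that it does not collide with Object.toHash, and the computations counter is a test hook. All of these names are hypothetical.

```d
class Base {
    private string x, y;
    int computations;   // test hook: counts full hash computations

    this(string x, string y) { this.x = x; this.y = y; }

    // Non-const on purpose, so that subclasses are allowed to cache.
    size_t hashValue() {
        ++computations;
        size_t h = 5381;
        foreach (char c; x) h = h * 33 + c;
        foreach (char c; y) h = h * 33 + c;
        return h;
    }
}

class Cached : Base {
    private bool cached;
    private size_t hashCache;

    this(string x, string y) { super(x, y); }

    override size_t hashValue() {
        if (!cached) {
            hashCache = super.hashValue();
            cached = true;
        }
        return hashCache;
    }
}

void main() {
    Base b = new Cached("foo", "bar");
    auto h1 = b.hashValue();
    auto h2 = b.hashValue();   // dispatches to Cached.hashValue
    assert(h1 == h2);
    assert(b.computations == 1);

    // The price: the method is unavailable through a const reference.
    const Base cb = b;
    static assert(!__traits(compiles, cb.hashValue()));
}
```

Caching works through the base-class reference, but only because hashValue is not const; a const(Base) cannot call it at all.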

What I _really_ want to be able to do, is to declare D.toHash() as
taking this class instead:

	class logical_const_D {
		private const(string) x, y;
		bool cached = false;
		hash_t hash_cache;

		...
	}

This class has const versions of the fields inherited from B, but
_mutable_ versions of cached and hash_cache. Such an object can still be
used with B.xGetter and B.yGetter, because as far as _they're_
concerned, the object is still const.

More importantly, if the language allowed implicit conversion from D to
logical_const_D (since mutable x can implicitly convert to const x, and
ditto for y), then the definition of logical_const_D wouldn't have to be
public.  Thus, I could declare my class D something like this:

	class D : B {
		private:
			// define logical_const_D here
		public:
			hash_t toHash() logical_const_D { ... }
	}

The end-user doesn't need to know what logical_const_D is; if he has an
object of type D that can implicitly convert to logical_const_D, then he
can use toHash() on it. If he has an immutable(D), then he can't use
toHash() (because immutable(bool) and immutable(hash_t) can't implicitly
convert to bool and hash_t).

This way, we preserve the type system, *and* allow a caching
implementation of toHash, *and* preserve encapsulation (user doesn't
need to know which fields actually get changed by toHash -- that's an
implementation detail).

T

-- 
Being able to learn is a great learning; being able to unlearn is a greater learning.