const or immutable?

Fri Sep 24 17:20:04 UTC 2021

On Wed, Sep 22, 2021 at 01:06:59PM -0700, Ali Çehreli via Digitalmars-d wrote:
> Please excuse this thread here, which I know belongs more to the Learn
> forum but I am interested in the opinions of experts here who do not
> frequent that forum.
> 
> tl;dr Do you use 'const' or 'immutable' by-default for parameters and
> for local data?

Nope.  Probably should, but I don't because the transitivity of
const/immutable makes it very difficult to use correctly in complex data
structures.  I *have* tried this before in various cases, but found that
it usually requires a lot of busywork (oops compiler says can't pass
const here, can't accept const there, OK let's fix this function, oops,
the function it calls also needs const, oops, now it can't accept
mutable, must use inout instead, oops, inout is a pain to get right,
forget this, I give up), and the benefits, at least so far, are minimal
-- it has maybe found 1 or 2 bugs in my code, but only after many hours
of refactoring just to make everything const-correct.  So IMHO, not
worth the effort.

Where I've had most success with const/immutable is with leaf-node
modules implementing simple data structures, i.e., at the bottom of the
dependency chain, where other code depends on it but it doesn't depend
on other code. And perhaps one or two levels above that. But beyond
that, it quickly becomes cumbersome and with benefits not proportional
to the amount of effort required to make it work.

> Long version:
> 
> It was simpler in C++: We would make everything 'const' as much as
> possible.  Unfortunately, that doesn't work in D for at least two
> reasons:
> 
> 1) There is 'immutable' as well

It's very simple. Use const when the data could be either immutable or
mutable (usually this applies to function parameters), immutable
everywhere else.
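
A minimal sketch of the parameter half of that rule, assuming a hypothetical read-only helper named `sum` (not from the original post):

```d
// Hypothetical helper: it only reads its argument, so the elements are const.
int sum(const(int)[] values)
{
    int total = 0;
    foreach (v; values)
        total += v;
    return total;
}

void main()
{
    int[] mutableData = [1, 2, 3];
    immutable(int)[] frozenData = [4, 5, 6];

    assert(sum(mutableData) == 6);   // mutable elements convert to const
    assert(sum(frozenData) == 15);   // immutable elements convert to const
}
```

One function body serves both callers because mutable and immutable both convert implicitly to const.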

But since POD types make copies anyway, in practice I find
const/immutable not worth the complexity with PODs. Strings are perhaps
the most notable exception to this IME.

> 2) There is no head-const (i.e. no 'const' pointer to mutable data;
> i.e.  "turtles all the way down")

TBH, I see this as an advantage.  However, there *are* certainly cases
where you really want head-const, but there's no in-language solution
(and Phobos' Rebindable isn't a 100% solution).
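
A small sketch of what transitivity means here, and of what Rebindable does offer. The `Widget` class and `rebindAndRead` function are hypothetical names for illustration. Note that Rebindable gives a rebindable reference to const data for class types, which is the reverse arrangement from head-const, so it does not fill the gap described above:

```d
import std.typecons : Rebindable;

// Hypothetical class used only for illustration.
class Widget
{
    int x;
    this(int x) { this.x = x; }
}

// Rebinds a reference to const data and returns what it ends up seeing.
int rebindAndRead()
{
    Rebindable!(const Widget) w = new Widget(1);
    w = new Widget(2);                            // rebinding is allowed
    static assert(!__traits(compiles, w.x = 3));  // the referent stays const
    return w.x;
}

void main()
{
    int value = 1;
    const(int*) p = &value;
    // const is transitive: a const pointer also makes its target const.
    static assert(!__traits(compiles, *p = 2));
    static assert(!__traits(compiles, p = null));

    assert(rebindAndRead() == 2);
}
```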

> Further complications:
> 
> - Reference semantics versus copy semantics; by type (e.g. slices), by
> the 'ref' keyword, by a member of a struct that has reference
> semantics (struct is by-copy; but a member may not be), etc.

Yeah, once you get beyond the most trivial data structures, you start
running into problems with const/immutable due to the complicated
interactions with everything else.

> - It is said that 'immutable' is a stronger type of const, which at
> first sounds great because if 'const' is good, 'immutable' should be
> even better, right? Unfortunately, we can't make everything
> 'immutable' because 'const' and 'immutable' have very different
> meanings at least in parameter lists.

I think it's fallacious to try to put const/immutable on a scale of
"better" or "worse".  They serve different purposes: const is to
guarantee the recipient of a reference cannot modify the referent;
immutable is to guarantee the referent itself can never be modified. The
former is like lending your data to somebody that you don't trust -- you
yourself can still touch the data but the recipient is only allowed to
look at it.  The latter is when the data itself cannot ever change, not
even by yourself.
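
The lending analogy can be sketched as follows; `observeThroughConst` is a hypothetical name, and the `static assert` lines only check what the compiler rejects:

```d
// Returns what a const view observes after the owner writes through
// its own mutable reference.
int observeThroughConst()
{
    int[] owner = [1, 2, 3];
    const(int)[] view = owner;   // lend read-only access
    owner[0] = 42;               // the owner can still write
    return view[0];              // the const view sees the change
}

void main()
{
    assert(observeThroughConst() == 42);

    int[] owner = [1, 2, 3];
    const(int)[] view = owner;
    immutable(int)[] frozen = [1, 2, 3];

    // The borrower cannot write through the const view.
    static assert(!__traits(compiles, view[0] = 7));
    // Nobody can write to immutable data.
    static assert(!__traits(compiles, frozen[0] = 7));
    // Mutable data cannot be implicitly passed off as immutable.
    static assert(!__traits(compiles, frozen = owner));
}
```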

> - Parameters versus local data.
> 
> I am seeking simple guidelines like C++'s "make everything const."

Parameters are where the const/immutable are most differentiated. Local
data -- it depends. If you got it from another function call, it may
already come as const or immutable, so you just have to follow what you
receive (or weaken immutable to const if you wish, though I don't see
the point unless you plan to rebind it to mutable later on).  If it's
pure POD locally-initialized, just use immutable.

> Let's start with what I like as descriptions about parameters:
> 
> 1) 'const' parameter is "welcoming" because it can work with mutable,
> 'const', and 'immutable'. It (sometimes) means "I am not going to
> mutate your data."
>
> 2) 'immutable' parameter is "selective" because it can work only with
> 'immutable'. It means "I require immutable data."

Const is like a 3rd party contract to ensure that they don't damage your
data.  They cannot change it, but someone who has write access (i.e., a
mutable reference) still may.

Immutable is like data made of steel: you cannot change it even if you
wanted to.  I.e., *nobody* has write access to it.

> But it's not that simple because a 'const' parameter may be a copy of
> the argument, in which case, it means "I will not mutate *my* data."
> This is actually weird because we are leaking an implementation detail
> here: Why would the caller care whether we mutate our parameter or
> not?

In this case, I'd use `in` instead: this tells the caller "this
parameter is an input; its value will not change afterwards". For PODs,
it already doesn't change, but `in` reads nicer than `const`. :-D

> // Silly 'const':
> void foo(const int i) {
>   // ...
> }

This reads less silly:

	void foo(in int i) { ... }

:-)
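
A short sketch of what `in` does to the parameter inside the function, assuming a hypothetical function named `twice`:

```d
// Hypothetical function: `in` makes the parameter const inside the body.
int twice(in int i)
{
    static assert(is(typeof(i) == const(int)));
    static assert(!__traits(compiles, i = 0));  // cannot reassign it
    return i * 2;
}

void main()
{
    int n = 21;
    assert(twice(n) == 42);
    assert(n == 21);   // the caller's copy was never at risk anyway
}
```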

[...]
> Aside: If 'const' is welcoming, why do we type 'string' for string
> parameters when we don't actually *require* immutable:
> 
> // Unnecessary 'immutable' but I do this everywhere.
> void printFirstChar(string s) {
>   write(s.front);
> }
> 
> It should have better been const:
> 
> void printFirstChar(const char[] s) {
>   write(s.front);
> }
> 
> But wait! It works above only because 'front' happened to work there.
> The problem is, 's' is not an input range; and that may matter
> elsewhere:
> 
>   static assert(isInputRange!(typeof(s)));  // Fails. :(

It makes me cringe every time someone writes const/immutable without
parentheses, because it's AMBIGUOUS, and that's precisely the problem
here.  What you *really* meant to write is const(char)[], but by
omitting the parentheses you accidentally defaulted to const(char[]),
which no longer works as a range.
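
The difference can be checked at compile time. This is only a sketch; `unparenthesized` is a hypothetical function used to inspect the parameter type:

```d
import std.range.primitives : isInputRange;

// Hypothetical function: written without parentheses, const covers
// the whole slice, not just its elements.
void unparenthesized(const char[] s)
{
    static assert(is(typeof(s) == const(char[])));
}

// A fully const slice cannot be advanced by popFront, so it is not a range.
static assert(!isInputRange!(const(char[])));

// A mutable slice of const elements can be advanced, so it is a range.
static assert(isInputRange!(const(char)[]));

void main()
{
    unparenthesized("abc");
}
```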

Also, const(char)[] works as long as you don't need to rebind. But if
you do, e.g., in a parsing function that advances the range based on
what was consumed, you run into trouble:

	// N.B.: ref because on exit, we update input to after the
	// token.
	void consumeOneToken(ref const(char)[] input) {
		...
	}

	string input = "...";
	consumeOneToken(input);	// Oops, compile error

On the surface, this seems like a bug, because input can be rebound
without violating immutable (e.g., input = input[1..$] is valid). But on
closer inspection, this is not always true:

	void evilFunction(ref const(char)[] input) {
		char[] immaMutable;
		input = immaMutable; // muahaha
	}

	string input = "...";
	evilFunction(input); // if this compiled, a string would now refer to mutable data

This is why the compiler does not allow binding string
(immutable(char)[]) to ref const(char)[].

But that also means you don't want to use const(char)[] when ref is
involved. Instead, you want string:

	void consumeOneToken(ref string input) {...} // this works

But then, this also means you can't pass in mutable char[]!  So
ultimately, one solution is, rebind string to const(char)[] in the
*caller*, then pass it to the function:

	void consumeOneToken(ref const(char)[] input) {
		...
	}

	string origInput = "...";
	const(char)[] input = origInput; // ugly ugly
	consumeOneToken(input);	// but at least this works now
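
The three cases above can be verified with `__traits(compiles)`. This is a sketch, not a real tokenizer: `consumeRefConst` and `consumeRefString` are hypothetical stand-ins that simply drop one character:

```d
// Hypothetical stand-ins for consumeOneToken: each drops one character.
void consumeRefConst(ref const(char)[] input) { input = input[1 .. $]; }
void consumeRefString(ref string input) { input = input[1 .. $]; }

void main()
{
    string s = "abc";
    char[] m = "abc".dup;

    // string does not bind to ref const(char)[].
    static assert(!__traits(compiles, consumeRefConst(s)));
    // ref string accepts a string, but not a mutable char[].
    static assert( __traits(compiles, consumeRefString(s)));
    static assert(!__traits(compiles, consumeRefString(m)));

    // Workaround: rebind to const(char)[] in the caller first.
    const(char)[] view = s;
    consumeRefConst(view);
    assert(view == "bc");
    assert(s == "abc");   // the original string variable is untouched
}
```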

> So only the elements of the string should be 'const':
> 
> void printFirstChar(const(char)[] s) {
>   write(s.front);
>   static assert(isInputRange!(typeof(s)));  // Passes.
> }
> 
> (Granted, why is it 'const' anyway? Shouldn't printFirstChar be a
> function template? Yes, it should.)

Why should it be?  Using a template here generates bloat for no good
reason. Using const(char)[] makes it work for either case with just one
function in the executable.

> So, what are your guidelines here?
> 
> More important to me: How do you define your local data that should
> not be mutated?
> 
>   const     c = SomeStruct();
>   immutable i = SomeStruct();
> 
> In this case, 'const' is not "welcoming" nor 'immutable' is
> "selective" because these are not parameters; so, the keywords have a
> different meaning here: With local data, they both mean "do not
> mutate". Is 'immutable' better here because we may pass that data to
> an immutable-requiring function?  Perhaps we should learn from string
> and make it really 'immutable'? But it's not the same because here
> 'immutable' applies to the whole struct whereas 'string's immutability
> is only with its elements. There! Not simple! :)

Ugh. I wish people would stop writing const/immutable without
parentheses. ;-)  Always write it with parentheses, and the problem goes
away: you write const(T[]) when you wish the whole object to be const,
and you write const(T)[] when you want individual elements to be const
but the outer structure to be mutable (resp. immutable in both cases).
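
A minimal sketch of the two spellings side by side; `dropFirst` is a hypothetical helper that relies on being able to rebind its parameter:

```d
// Hypothetical helper: the slice is rebindable, its elements are const.
const(int)[] dropFirst(const(int)[] s)
{
    s = s[1 .. $];   // rebinding the slice itself is fine
    return s;
}

void main()
{
    int[] data = [1, 2, 3];

    const(int)[] elementsConst = data;  // only the elements are const
    const(int[]) wholeConst = data;     // the slice itself is const too

    assert(dropFirst(elementsConst) == [2, 3]);

    static assert(!__traits(compiles, elementsConst[0] = 9));
    static assert(!__traits(compiles, wholeConst[0] = 9));
    static assert(!__traits(compiles, wholeConst = wholeConst[1 .. $]));
}
```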

But yeah, this only works at the bottom-most level of abstraction. Once
your type becomes more complex, it quickly devolves into a cascade of
exploding const/immutable complexity, and you start running into lovely
conundrums like how to make const(SomeStruct!T) ==
SomeStruct!(const(T)).

Also, const for local variables is really only necessary if you got the
data from a function that returns const; otherwise, it's practically no
different from immutable and you might as well use immutable for
stronger guarantees.
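
One place where the stronger guarantee shows up in practice, sketched with two hypothetical functions, `firstOf` (requires immutable) and `lastOf` (accepts const):

```d
// Hypothetical function that insists on immutable data.
int firstOf(immutable(int)[] values) { return values[0]; }

// Hypothetical function that only promises not to write.
int lastOf(const(int)[] values) { return values[$ - 1]; }

void main()
{
    immutable(int[]) frozen = [1, 2, 3];
    const(int[]) readOnly = [4, 5, 6];

    // An immutable local can go to either kind of function.
    assert(firstOf(frozen) == 1);
    assert(lastOf(frozen) == 3);

    // A const local can only go to the const-accepting one.
    assert(lastOf(readOnly) == 6);
    static assert(!__traits(compiles, firstOf(readOnly)));
}
```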

[...]
> Personally, I generally ignore 'immutable' (except, implicitly in
> 'string') both for parameters and local data.
[...]

Immutable is powerful because it has very strong guarantees.
Unfortunately, these very strong guarantees also make its scope of
applicability extremely narrow -- so narrow that you rarely need to use
it. :-D

T

-- 
Frank disagreement binds closer than feigned agreement.