mutable, const, immutable guidelines

Wed Oct 2 13:15:06 PDT 2013

On Wednesday, 2 October 2013 at 17:07:55 UTC, Ali Çehreli wrote:
> On 10/02/2013 06:09 AM, Daniel Davidson wrote:
>
> > 1. If a variable is never mutated, make it const, not immutable.
> > 2. Make the parameter reference to immutable if that is how you
> > will use it anyway. It is fine to ask a favor from the caller.
> > ...
> >
> > If you follow (1) exclusively, why the need for immutable in the
> > language at all?
>
> For one, immutable is thread-safe.

Ok - then a new guideline should be added to the effect that "if 
you want a read-only guarantee that can be counted on across 
threads, use immutable". But I'm not sure that is a sensible 
guideline, or how you would know up front when it applies. Often 
threading is an afterthought. My point is not that immutable is 
not useful, but that if you never make an lvalue immutable (via 
guideline 1), it is hard to see immutable ever being used at all. 
The one case in the slides where it was used was when a method 
required it on a parameter type with mutable aliasing. In that 
case the client was required to copy beforehand. In other words, 
I think even the simple summary guidelines are not so simple, 
maybe because they are incomplete. I would love to help complete 
them.
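
A minimal sketch of that case, as I understand it. The functions 
sum and keep are hypothetical names made up for illustration; they 
are not from the slides. A const parameter accepts mutable data 
as-is, while an immutable parameter forces the caller to copy 
first:

```d
// Hypothetical function: promises only that *it* will not modify the data.
int sum(const(int)[] values) {
    int total = 0;
    foreach (v; values)
        total += v;
    return total;
}

// Hypothetical function: requires that *nobody* can modify the data.
immutable(int)[] keep(immutable(int)[] values) {
    return values;
}

void main() {
    int[] mutableData = [1, 2, 3];

    // const accepts mutable data directly.
    assert(sum(mutableData) == 6);

    // immutable does not: the caller has to make a copy first.
    static assert(!__traits(compiles, keep(mutableData)));
    auto kept = keep(mutableData.idup);

    // Later changes to the original are not visible through the copy.
    mutableData[0] = 100;
    assert(kept[0] == 1);
}
```

The copy is the "favor" asked of the caller in guideline 2.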

> Also, an object can safely hold on to a piece of data and trust 
> that it will never change.

I think for this to be true in general, the piece of data must be 
immutable, not just tail-immutable. If the class holds onto a 
string (i.e. only tail-immutable), the class can still append to 
its string - it just cannot change the existing characters.
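
To make that concrete, here is a small sketch. FileInfo and 
addExtension are hypothetical names used only for this example:

```d
// Hypothetical type that holds on to a string it was given.
struct FileInfo {
    string name;   // tail-immutable: immutable(char)[]

    void addExtension(string ext) {
        name ~= ext;   // allowed: the member can be extended or rebound
    }
}

void main() {
    string original = "report";
    auto info = FileInfo(original);

    info.addExtension(".txt");
    assert(info.name == "report.txt");

    // The caller's string is unaffected by the append.
    assert(original == "report");

    // Changing an existing character is rejected at compile time.
    static assert(!__traits(compiles, { info.name[0] = 'R'; }));
}
```

So the object can trust the characters it was handed, but its own 
member is not frozen.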

> A file name is a good example: An object need not copy a string 
> that is given to it as a constructor parameter. That string 
> will be immutable as long as that reference is valid.
>

IMHO file name and ultimately string are the *worst* example :-) 
The reason is that they are a special case: they provide *safe 
sharing*. A string encapsulates its contiguous data and does not 
allow mutation of elements. It does allow concatenation, but by 
design the result cannot be seen by other references to the same 
data, because of copy-on-write. Granted, if you have a handle to 
an immutable(T)[] you know *it* (i.e. the complete string) will 
not change (unless you change it). The problem is that this is 
just a byproduct of the implementation details of immutable(T)[]. 
From the referrer's point of view it is immutable.

For example, the following comparable associative array does not 
have the same benefit as string since two references to the same 
underlying data *do see changes*:

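     import std.stdio;

     struct V {
       string s = "I'm a value";
     }

     // The values are immutable, but the associative array
     // itself can still gain entries.
     alias Map = immutable(V)[int];

     void main() {
       Map m = [ 1:V(), 2:V("bar") ];
       Map m2 = m;        // m2 refers to the same underlying table
       m2[3] = V("moo");  // insert through the second reference
       writeln(m);        // the new entry is visible through m too
       writeln(m2);
     }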

The reason I think string is a bad example is that one might 
incorrectly assume tail-immutable is good enough for truly 
read-only, thread-safe data. It is for string or any 
immutable(T)[], but not in general.
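
Here is the same contrast as a sketch with asserts instead of 
output. To keep it simple it uses string[int] rather than the Map 
type from my example, which is an assumption on my part; the 
values are still tail-immutable:

```d
void main() {
    // Slice of immutable elements: appending through one reference
    // is never visible through the other.
    immutable(int)[] a = [1, 2];
    immutable(int)[] b = a;
    b ~= 3;
    assert(a == [1, 2]);
    assert(b == [1, 2, 3]);

    // Associative array with tail-immutable values: both references
    // share one table, so an insert through m2 shows up in m.
    string[int] m = [1: "one"];
    string[int] m2 = m;
    m2[2] = "two";
    assert(2 in m);

    // Only a fully immutable associative array rules that out.
    immutable string[int] frozen = [1: "one"];
    static assert(!__traits(compiles, { frozen[2] = "two"; }));
}
```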

> > Maybe it is a philosophical question, but where does
> > immutability really come from? Is it an aspect of some piece of
> > data or is it a promise that function will not change it? Or is
> > it a requirement by a function that data passed not be changed
> > by anyone else?
>
> The two concepts are necessarily conflated in languages like 
> C++ because for those languages there is only the 'const' 
> keyword.
>
> D's immutable is the last point you made: a requirement by a 
> function that data passed not be changed by anyone else?
>
> > I found the end of the video amusing, when one gentleman looking
> > at a rather sophisticated "canonical" struct with three overloads
> > for 'this(...)' and two overloads for opAssign, asked if all
> > those methods were required.
>
> I think you are referring to Ben Gertzfield's question. He 
> later told me that he was sorry that he asked the wrong 
> question at the wrong time. :D
>

I can understand - I would not want to put you on the spot on 
stage either. But I have no qualms doing it in the newsgroups 
:-)

> To be honest, that slide was added more as an after thought. I 
> suspect that that canonical struct is simpler today after 
> changes made to dmd since the conference. (I will look at it 
> later.) For example, there are no 'pure' keywords around in 
> that code. I think the (ability to) use of that keyword will 
> simplify matters.
>
> However, I should have answered Ben's question by the following:
>
> * If the struct is simply a value type or has no mutable 
> indirections, no member function is really necessary. D takes 
> care of it automatically.
>

I wonder whether that is a good idea. The immediate problem is 
that "D takes care of it" breaks when you go from no mutable 
aliasing to some mutable aliasing, and that mutable aliasing 
could be deep down in the composition chain. I know you can feel 
this issue, since your slides point out cases where 
well-encapsulated structs make changes that should not impact 
client code but break it anyway.
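
A sketch of the breakage I mean. Point and Buffer are hypothetical 
structs; imagine Buffer is what Point turns into after its author 
adds a dynamic array member:

```d
// No indirections: a plain value type.
struct Point {
    int x;
    int y;
}

// One mutable indirection added (hypothetical later revision).
struct Buffer {
    int id;
    int[] data;
}

// The compiler converts the value type to immutable implicitly...
static assert(is(Point : immutable(Point)));
// ...but refuses once there is mutable aliasing.
static assert(!is(Buffer : immutable(Buffer)));

void main() {
    Point p = Point(1, 2);
    immutable Point ip = p;          // fine: the copy shares nothing
    assert(ip.x == 1);

    Buffer b = Buffer(1, [1, 2, 3]);
    // Client code like this stops compiling after the change:
    static assert(!__traits(compiles, { immutable Buffer ib = b; }));

    // const still accepts it, since const promises less.
    const Buffer cb = b;
    assert(cb.data.length == 3);
}
```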

> * Sometimes post-blit will be necessary for correctness. For 
> example, we may not want two objects share the same internal 
> buffer, File, etc.
>

Agreed - but post-blits are easier said than done with nested 
data. For instance, if you have a member that is a struct that 
does not provide a post-blit (i.e. the original author did not 
concern himself with sharing), then you effectively have to write 
his post-blit for him in your post-blit. This is why I lobby for 
a language-supported, generalized dup.
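
A minimal sketch of that situation, with hypothetical structs 
Inner and Outer. Inner stands in for someone else's type that 
never defined a post-blit:

```d
// Third-party style struct: no post-blit, so copies share `data`.
struct Inner {
    int[] data;
}

struct Outer {
    Inner inner;

    // Outer has to do Inner's deep copy on its behalf.
    this(this) {
        inner.data = inner.data.dup;
    }
}

void main() {
    // Copying Inner alone shares the array.
    auto i1 = Inner([1, 2, 3]);
    auto i2 = i1;
    i2.data[0] = 7;
    assert(i1.data[0] == 7);

    // Copying Outer runs the post-blit, so the arrays are separate.
    auto a = Outer(Inner([1, 2, 3]));
    auto b = a;
    b.inner.data[0] = 99;
    assert(a.inner.data[0] == 1);
    assert(b.inner.data[0] == 99);
}
```

Every extra level of nesting means Outer's post-blit needs to know 
more about the internals of the types it contains.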

> * As noted in the presentation, the default behavior of struct 
> assignment in D is exception-safe: first copy then swap. In 
> some cases that automatic behavior is less than optimal e.g. 
> when an object's existing buffer can be reused; no need to 
> "copy then swap (which implicitly destroys)" in that case. (The 
> spec or Andrei's book has that exact case as an example.) So, 
> opAssign should be for optimization reasons only.
>
> (I will look at these again later.)
>

I look forward to it and would be glad to help get to a good set 
of guidelines. For instance, by filling out the table below with 
a robust "when to use" for each cell. It only covers variable 
declarations, not parameter passing. Here C(T) is const(T) and 
I(T) is immutable(T).

| context         | T | C(T) | I(T) | I(T)[] |
|-----------------+---+------+------+--------|
| local stack     |   |      |      |        |
| local heap      |   |      |      |        |
| global          |   |      |      |        |
| instance member |   |      |      |        |
| static member   |   |      |      |        |

You could actually have two tables - one for T1 with no mutable 
aliasing and one for T2 with mutable aliasing. But relying on 
that in the decision matrix may lead to issues.

Thanks,
Dan