Store mutable indirections in immutable data with this one weird trick!

Paul Backus snarwin at gmail.com
Sat Nov 13 06:50:48 UTC 2021


In the recent [discussion about reference counting][1], a 
question that's repeatedly come up is, how can we store a pointer 
to immutable data in immutable memory?

So far, suggested solutions have included:

- Add a `__mutable` qualifier that functions as a "back door" to 
`immutable` and `const`.
- Store the mutable data at some offset outside the object's 
declared layout, and access it with pointer arithmetic.
- Store the mutable data in a global associative array.

All of these solutions have unsatisfying tradeoffs: some are 
incompatible with `pure`; others require language changes to work 
without causing undefined behavior. It seems inevitable that we 
will have to accept some kind of compromise--our only choice is 
which one.

Except...what if we don't? What if the One Weird Trick that will 
give us everything we want has been sitting under our noses all 
along?

I present: `TailUnqual`!

```d
import tail_unqual;

void main() @safe pure
{
     // Qualifiers on a TailUnqual variable are not 
transitive--that is,
     // they apply only to the variable itself, not the data it 
points to.
     immutable TailUnqual!(int*) p = new int(123);
     assert(*p == 123);

     // We can't mutate `p` itself, because it's immutable...
     assert(!__traits(compiles,
         p = immutable(TailUnqual!(int*))(new int(789))
     ));

     // ...but we can mutate the data it points to!
     *p = 456;
     assert(*p == 456);
}
```

How does it work? How can I possibly break such a fundamental 
language rule and get away with it? As it turns out, the answer 
is almost embarrassingly simple:

* You're allowed to cast a pointer to an integer, and then cast 
that integer back to the same pointer.
* You're allowed to convert integer values freely between 
mutable, const, and immutable.

So, all we have to do is (a) store the pointer as an integer, and 
(b) convert it *back* to a pointer every time we access it. Add a 
little bit of `union` seasoning to satisfy the GC, and you get 
the embarrassingly simple implementation behind this magic trick:

```d
module tail_unqual;

import std.traits: isPointer;

private union HiddenPointer
{
     // Stores a pointer, cast to an integer
     size_t address;

     // Enables GC scanning (and prevents some other UB)
     void* unused;
}

struct TailUnqual(P)
     if (isPointer!P)
{
     private HiddenPointer data;

     this(P ptr) inout
     {
         data.address = cast(size_t) ptr;
     }

     @trusted @property
     P ptr() inout
     {
         // This is safe because:
         //  - `address` is always the active field of `data`
         //  - `data.address` was originally cast from a `P`
         return cast(P) data.address;
     }

     alias ptr this;
}
```

[Try it yourself on `run.dlang.io`.][2]

What do you think? Is it just crazy enough to work, or just plain 
crazy? Is there some fatal safety violation I've overlooked? Let 
me know in your replies!

[1]: https://forum.dlang.org/thread/smc5jb$2u8t$1@digitalmars.com
[2]: https://run.dlang.io/is/OVTbD3


More information about the Digitalmars-d mailing list