Object.toString, toHash, opCmp, opEquals

Fri Apr 26 03:13:50 UTC 2024

On 4/25/2024 6:32 PM, Timon Gehr wrote:
> A range is useless unless it is mutable. The range interface is inherently 
> mutable. To iterate a range, you have to call `popFront()` on it. There is no 
> way to have a `const popFront()`.

I agree there's no reason to have a const popFront(). But opEquals() is 
inherently non-mutable. Let's posit a mutating opEquals() and:

```
o.opEquals(o);
```

Which operand would opEquals() mutate: the left, the right, or both? And what would the result mean if it did?
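
A minimal sketch of that aliasing case. The class `Tally` and its `comparisons` counter are hypothetical, invented only for illustration; the `opEquals` signature is the non-const one `Object` declares today.

```d
class Tally
{
    int value;
    int comparisons; // hypothetical bookkeeping that opEquals mutates

    this(int value) { this.value = value; }

    // Non-const, matching the current Object.opEquals declaration.
    override bool opEquals(Object rhs)
    {
        auto other = cast(Tally) rhs;
        if (other is null) return false;
        ++comparisons;        // mutates the left operand
        ++other.comparisons;  // mutates the right operand, which may be the same object
        return value == other.value;
    }
}

void main()
{
    auto o = new Tally(1);
    assert(o.opEquals(o));
    assert(o.comparisons == 2); // both increments landed on the one object
}
```

The member function is called directly here, as in the snippet above, because `o == o` on class references is short-circuited by the runtime's identity check before the member is reached.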

>> The utility is being able to write borrow-checker style code, so you can avoid 
>> things like double frees.
>> ...
> 
> `@live` does not enable this.

```
auto p = q;
free(p);
free(q); // double free: p and q point to the same allocation
```

> Anyway, you are trying to impose nonsensical 
> restrictions on garbage-collected code. I have yet to run into a double-free 
> using GC allocation and I doubt `@live` would help me avoid that if it were a 
> thing.

D doesn't distinguish between GC pointers and non-GC pointers. That has been 
proposed, but I have very extensive experience with multiple pointer types, and 
they are a cure worse than the disease.

>> As I recall, it was you that pointed out that reference counting can never be 
>> safe if two mutable pointers to the same ref counted object (one to the 
>> object, the other to its interior) were passed to a function. (Freeing the 
>> first can leave the second interior pointer pointing to a deleted object.) The 
>> entire ref counting scheme capsized because of this.
> I provided the counterexample, but the unsound generalization is yours.

All it takes is one counterexample to capsize it.
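
To make the counterexample concrete, here is a rough sketch. `RC`, `Payload` and `interiorOutlivesPayload` are hypothetical names, not the design that was actually under discussion, and the destructor only sets a `freed` flag as a stand-in for a real `free()` so the sketch stays memory safe to run.

```d
struct Payload { int value; bool freed; }

// Toy reference-counted handle (hypothetical, for illustration only).
struct RC
{
    Payload* p;
    int* count;

    this(int v) { p = new Payload(v); count = new int(1); }
    this(this) { if (count) ++*count; }                      // copy: one more owner
    ~this() { if (count && --*count == 0) p.freed = true; }  // stand-in for free(p)
}

// Receives the handle and a reference into its payload at the same time.
bool interiorOutlivesPayload(ref RC rc, ref int interior)
{
    Payload* old = rc.p; // kept only so the sketch can inspect the flag
    rc = RC(0);          // drops the last reference to the old payload
    interior = 7;        // with a real free(), this would write to freed memory
    return old.freed;
}

void main()
{
    auto rc = RC(42);
    assert(interiorOutlivesPayload(rc, rc.p.value));
    assert(rc.p.value == 0); // rc now owns the new payload
}
```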

> (Technically, there would be ways to type check that code without banning 
> mutation outright.)

Neither Andrei nor I nor anyone else working on it could figure out a solution 
(other than disallowing all pointers to payload). The borrow checker does solve 
it, though.

>> Why would anyone need toHash(), toString(), opEquals() or opCmp() to mutate 
>> their data? Wouldn't that be quite surprising behavior?
>>
> 
> As I keep pointing out, there is a difference between mutating abstract data and 
> concrete memory locations. For instance, data types with amortized guarantees 
> usually have to reorganize the internal data representation on each query. 
> (Think e.g. splay trees.)
> 
> Anyway, let's for the sake of argument assume that I want to write functions 
> that leave memory in exactly the state they encountered it in. Const will 
> _still_ unduly restrict me because it is not fine-grained enough.
> 
> ```d
> import std.stdio, std.range, std.conv;
> 
> struct S{
>      auto r=iota(1,2);
>      string toString()const{ return text(r); }

I agree that mutates the argument passed to toString(). That would consume the 
range. Calling toString() again would return an empty string.

> Sometimes there is not even a safe workaround to get a mutable version of a 
> range, because of transitive `const`. A range can have indirections in its 
> implementation.
> This is just one example establishing that `const` is not expressive enough to 
> say _ONLY_ "this will not mutate anything". It also spells: "This code can be a 
> huge pain in the ass at any point in the future for dumb, incidental reasons."
> 
> I really do not want to deal with this. I'd much rather fork Phobos so it uses 
> non-const alternatives to toHash and toString.

I suppose it wouldn't help if I suggested:

```
writeln(text(r));
```

I only proposed the const toString() for Object.toString(), not for struct, 
where indeed you are free to have struct toString() do anything you want.
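
For readers following the quoted struct example, a minimal sketch of what a const toString() has to do when the struct holds a range. The alias `R` is introduced here for illustration. The mutable copy works only because this particular range holds no indirections, which is exactly the limitation raised in the quote.

```d
import std.conv : text;
import std.range : iota, isInputRange;

alias R = typeof(iota(1, 2)); // hypothetical alias, for readability

static assert(isInputRange!R);
static assert(!isInputRange!(const R)); // popFront() cannot be called on a const range

struct S
{
    R r = iota(1, 2);

    string toString() const
    {
        R copy = r;        // implicit const-to-mutable copy: R has no indirections
        return text(copy); // formats the elements of the copy
    }
}

void main()
{
    const s = S();
    assert(s.toString() == "[1]");
    assert(s.toString() == "[1]"); // same result on a second call
}
```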

Class and struct are fundamentally different in that class is a universal 
hierarchy with a common root, and hence we must define what that common root is. 
Struct, on the other hand, is rootless, and hence the user can define it however 
he pleases.

I agree with you that Object shouldn't have had any members, and Andrei and I 
did discuss that, but since it had members, we couldn't really take them away. 
Note that COM classes also have a common root, IUnknown, whose members are 
QueryInterface(), AddRef() and Release().

> If you expect people to prove properties to an incomplete type system via 
> annotations and to accept unnecessary restrictions, they have to get some value 
> out of it. You also would not go: "Starting from tomorrow, you have to prove to 
> me that you brush your teeth every day. I want video evidence." And then, when I 
> refuse, you can't say: "Why would you not brush your teeth?" This is what this is.
> 
> I caution you to now not miss the forest for the trees and engage in a 
> "tooth-brushing related" argument (e.g., proposing a different range design or 
> something like that). This is an inherent issue. Even if you make the type 
> system more expressive, the annotation overhead is still real, and often 
> uneconomical.
> 
> I am perfectly fine with having some restricted system like Rust for people who 
> want to do safe manual memory management. This would even be useful to me. But 
> this has to be opt-in, based on data structures, and interoperate as seamlessly 
> as possible with the full language.

I think I see your point of view. Mine is a little different. I have 
considerable experience with C. When I see:

```
int foo(T* p);
```

Is p an array? Is foo() going to mutate what it points to? Is foo() going to 
free() it? How would I know without reading the implementation? (The 
documentation is always incomplete, wrong, or missing.) Annotations give me 
confidence that I understand what it does. const/ref/scope here answer my 
questions, and the compiler backs it up.
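
As a rough sketch of how those annotations answer the three questions in D: the function names `peek`, `total` and `bump` are made up for this example, and note that `scope` is only enforced in @safe code when compiling with -preview=dip1000.

```d
struct T { int x; }

// const: the callee cannot mutate *p.
// scope: the callee does not keep p after it returns, so it cannot stash it for later.
int peek(scope const(T)* p) @safe { return p.x; }

// A slice carries its length, so "is it an array?" is answered by the type.
int total(scope const(T)[] items) @safe
{
    int sum = 0;
    foreach (ref item; items) sum += item.x;
    return sum;
}

// ref: exactly one T, and the caller should expect it to be modified.
void bump(ref T t) @safe { ++t.x; }

void main()
{
    auto one = new T(3);
    auto many = [T(1), T(2), T(3)];
    assert(peek(one) == 3);
    assert(total(many) == 6);
    bump(*one);
    assert(one.x == 4);
}
```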

> One thing I absolutely agree on with Robert is that it should always be
> _possible_ to write simple @safe D code without any advanced type system
> shenanigans. I think any design that strays from that principle is bad. This
> proposed change absolutely torpedoes that.

I agree with Robert, too. I asked him to prepare a list of his proposals so I 
can see what can be done.

P.S. const class Objects are more or less unusable with the non-const toString, 
toHash, opCmp and opEquals.

P.P.S. All of D's annotations are subtractive. This means you can write code 
without annotations and it'll work. But it probably won't be safe.
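
A small sketch of what "subtractive" means here. The function `next` is a made-up example; the `static assert` lines check at compile time that the annotated variants are rejected.

```d
// No annotations: pointer arithmetic compiles without complaint.
int* next(int* p) { return p + 1; }

// @safe takes pointer arithmetic away.
static assert(!__traits(compiles, () @safe { int* p; p = p + 1; }));

// const takes writing through the pointer away.
static assert(!__traits(compiles, (const(int)* p) { *p = 1; }));

// The unannotated equivalent is still accepted.
static assert(__traits(compiles, (int* p) { *p = 1; }));

void main()
{
    int[2] a = [10, 20];
    assert(*next(&a[0]) == 20);
}
```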

P.P.P.S. I almost never write a double-free bug these days. But that doesn't 
translate to "don't need double free protection", as I spent many years making 
that mistake and tracking the bugs down. I even wrote my own malloc/free debugger 
to help. Eventually, I simply internalized what not to do. But that isn't a 
transferable skill. I can't even explain what I do.

Anyhow, thanks for the food for thought!