Object.toString, toHash, opCmp, opEquals

Fri Apr 26 22:55:32 UTC 2024

On Friday, April 26, 2024 2:17:09 PM MDT Walter Bright via Digitalmars-d 
wrote:
> D1 is an example of a language with no attributes and no const. D1 works as
> a good programming language.
>
> But it gives the programmer no indication of whether the arguments get
> mutated or not. He'll have to read and understand the called function, as
> well as the functions it calls.
>
> It is reasonable to use const parameters when the argument is not going to
> be mutated. I personally prefer to use that as much as possible, and I like
> very much that the compiler will enforce it. With the mutating 4 functions,
> I cannot use const class objects.

You can do that today. E.g., this code compiles and runs just fine.

```
void main()
{
    static class C
    {
        int i;

        this(int i)
        {
            this.i = i;
        }

        override bool opEquals(Object rhs)
        {
            if(auto r = cast(C)rhs)
                return opEquals(r);
            return false;
        }

        bool opEquals(const C rhs) const
        {
            return this.i == rhs.i;
        }

        override int opCmp(Object rhs)
        {
            if(auto r = cast(C)rhs)
                return opCmp(r);

            throw new Exception("Cannot compare C with types that aren't C");
        }

        int opCmp(const C rhs) const
        {
            if(this.i < rhs.i)
                return -1;
            if(this.i > rhs.i)
                return 1;
            return 0;
        }

        override string toString() const
        {
            import std.format : format;
            return format!"C(%s)"(i);
        }

        override size_t toHash() const @safe nothrow
        {
            return i;
        }
    }

    const c1 = new C(42);
    const c2 = new C(99);

    assert(c1 == c1);
    assert(c1 != c2);

    assert(c1 <= c1);
    assert(c1 < c2);

    assert(c1.toHash() == 42);

    import std.format : format;
    assert(format!"c1: %s"(c1) == "c1: C(42)");
}
```

All four functions worked with const references. What does not work is if
you use const Object references.

    assert(c1 == c2);

gets lowered to the free function, opEquals:

    assert(opEquals(c1, c2));

and because that function is templated, the derived class overloads get
used. This is what happens in almost all D code using classes. The exception
is code that uses Object directly, and pretty much no code should be doing
that.
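
To show why the templating matters, here's a minimal sketch. It is _not_
druntime's actual implementation: sketchEquals is a made-up name, and the real
free function does more (e.g. it also looks at the dynamic types via typeid).
The only point is that the template parameters keep the static types of the
arguments, so overload resolution can see the const overload on C.

```d
// Hypothetical, simplified stand-in for the templated free function opEquals.
// The name sketchEquals and its body are assumptions made for illustration.
bool sketchEquals(LHS, RHS)(LHS lhs, RHS rhs)
if (is(LHS : const Object) && is(RHS : const Object))
{
    if (lhs is rhs)
        return true;
    if (lhs is null || rhs is null)
        return false;

    // lhs and rhs still have their static types here (e.g. const(C)), so
    // this picks C's const overload rather than Object's non-const one.
    return lhs.opEquals(rhs) && rhs.opEquals(lhs);
}

class C
{
    int i;

    this(int i)
    {
        this.i = i;
    }

    override bool opEquals(Object rhs)
    {
        if (auto r = cast(C) rhs)
            return opEquals(r);
        return false;
    }

    bool opEquals(const C rhs) const
    {
        return rhs !is null && this.i == rhs.i;
    }
}

void main()
{
    const a = new C(42);
    const b = new C(42);
    const c = new C(99);

    assert(sketchEquals(a, b));
    assert(!sketchEquals(a, c));
}
```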

Java passes Object all over the place and stores it in data structures such
as containers, because Java doesn't have templates. For them, generic code
has to operate on Object, because they don't have any other way to do it. In
sharp contrast, we have templates. So, generic code has no need to operate
on Object, and as such, it has no need to call opEquals, opCmp, toHash, or
toString on Object. As it is, Object's opCmp throws, because there are
plenty of classes where opCmp doesn't even make sense, and there really
isn't a way to give Object an implementation that makes sense.

Generic code in D is templated, and as such, we can do stuff like we've done
with the free function, opEquals, and make it so the code that needs to
operate on classes generically operates on the exact type that it's given
instead of degrading to Object. And as such, we don't need any of these
functions to be on Object.
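
As a rough illustration of generic code operating on the exact type, here's a
sketch of a templated helper that compares class objects through const
references. Both Priority and indexOfLargest are made-up names for this
example, not anything from druntime or Phobos.

```d
class Priority
{
    int level;

    this(int level)
    {
        this.level = level;
    }

    override int opCmp(Object rhs)
    {
        if (auto r = cast(Priority) rhs)
            return opCmp(r);

        throw new Exception("Cannot compare Priority with other types");
    }

    int opCmp(const Priority rhs) const
    {
        if (this.level < rhs.level)
            return -1;
        if (this.level > rhs.level)
            return 1;
        return 0;
    }
}

// Hypothetical generic helper. T is whatever element type the caller passes
// (here const(Priority)), so the comparison below uses that type's opCmp and
// never degrades to Object.
size_t indexOfLargest(T)(T[] items)
{
    assert(items.length > 0);

    size_t best = 0;
    foreach (i; 1 .. items.length)
    {
        if (items[best] < items[i])
            best = i;
    }
    return best;
}

void main()
{
    const(Priority)[] items = [new Priority(2), new Priority(7), new Priority(5)];
    assert(indexOfLargest(items) == 1);
}
```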

It's already the case that code like format and writeln operate on the
actual class type that they're given and not Object. You already saw that
when you talked about using alternate overloads for toString.

D code in general does not operate on Object. AFAIK, the main place that
anything in D operates on Object at present is in old druntime code that has
yet to be templated. And if that code is templated, the need to have these
functions on Object goes away entirely. Then the entire debate of which
attributes these functions should have goes away. Classes can define them in
whatever way the programmer sees fit so long as the parameters and return
types match what's necessary for them to be called by the code that uses
these functions - just like happens with structs. The only difference is
that the derived classes within a particular class hierarchy will have to be
built on top of whatever signatures those functions were given on the first
class in the hierarchy that had them, whereas structs don't have to worry
about inheritance. But those signatures can then be whatever is appropriate
for that particular class hierarchy instead of trying to come up with a set
of attributes that make sense for all classes (which isn't possible).
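
For comparison, this is roughly what that freedom already looks like with
structs. Point is a made-up type, and the attributes on it are just the ones
that happen to fit this particular type; another type could pick different
ones. Templated code like sort, and non-templated code with strict attributes
like isSame, both just use whatever signatures the type provides.

```d
import std.algorithm.sorting : sort;

// Hypothetical example type; nothing here is dictated by a base class.
struct Point
{
    int x;
    int y;

    bool opEquals(const Point rhs) const @safe pure nothrow @nogc
    {
        return x == rhs.x && y == rhs.y;
    }

    size_t toHash() const @safe pure nothrow @nogc
    {
        return cast(size_t) x * 31 + cast(size_t) y;
    }

    int opCmp(const Point rhs) const @safe pure nothrow @nogc
    {
        if (x != rhs.x)
            return x < rhs.x ? -1 : 1;
        if (y != rhs.y)
            return y < rhs.y ? -1 : 1;
        return 0;
    }
}

// Callable from strictly attributed code because Point's own functions
// carry those attributes.
bool isSame(const Point a, const Point b) @safe pure nothrow @nogc
{
    return a == b && !(a < b) && !(b < a);
}

void main()
{
    assert(isSame(Point(1, 2), Point(1, 2)));

    // The built-in associative array uses Point's toHash and opEquals.
    int[Point] counts;
    counts[Point(1, 2)] = 1;
    assert(Point(1, 2) in counts);

    // sort is templated, so it uses Point's opCmp directly.
    auto pts = [Point(3, 1), Point(1, 2), Point(1, 1)];
    pts.sort();
    assert(pts[0] == Point(1, 1));
    assert(pts[2] == Point(3, 1));
}
```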

And given that Object really doesn't need to have any of these functions, we
likely would have removed them years ago if it weren't for the fact that
something like that would break code (in large part due to the override
keyword; a lot of the code would have worked just fine with those functions
being removed from Object if the derived classes didn't have to have
override, which will then become an error when the base class version of the
function is removed). Andrei also proposed ProtoObject as a way to change the
class hierarchy so that we could remove these functions (as well as the
monitor) from classes without breaking code built on top of Object. So,
we've known for years that we could fix this problem if we could just remove
these functions from Object.

Editions give us a way to make breaking changes in a manageable manner. This
should give us the opportunity to remove these four functions from Object
like we've discussed for years and couldn't do because it would break code.

And if we decide to not do that, putting const on these four functions would
actually make the situation worse. Yes, you could then call those four
functions on const Objects, but it would mean that every single class will
be forced to have these functions even if they cannot actually implement
them properly with const. And what do such types do at that point? Do they
throw an exception?

```
    static class C
    {
        Mutex mutex;
        shared int* ptr;

        this()
        {
            this.ptr = new shared int;
        }

        override bool opEquals(const Object rhs) const
        {
            throw new Exception("C does not support const");
        }

        bool opEquals(C rhs)
        {
            mutex.lock();
            immutable left = cast()*this.ptr;
            mutex.unlock();

            rhs.mutex.lock();
            immutable right = cast()*rhs.ptr;
            rhs.mutex.unlock();

            return left == right;
        }
    }
```

Do they cast away const and mutate?

```
    static class C
    {
        Mutex mutex;
        shared int* ptr;

        this(int i)
        {
            this.ptr = new shared int;
        }

        override bool opEquals(const Object rhs) const
        {
            if(auto r = cast(C)rhs)
                return (cast()this).opEquals(r);
            return false;
        }

        bool opEquals(C rhs)
        {
            mutex.lock();
            immutable left = cast()*this.ptr;
            mutex.unlock();

            rhs.mutex.lock();
            immutable right = cast()*rhs.ptr;
            rhs.mutex.unlock();

            return left == right;
        }
    }
```

Do they just have different behavior in the Object overload?

```
    static class C
    {
        Mutex mutex;
        shared int* ptr;

        this(int i)
        {
            this.ptr = new shared int;
        }

        override bool opEquals(const Object rhs) const
        {
            return this is rhs;
        }

        bool opEquals(C rhs)
        {
            mutex.lock();
            immutable left = cast()*this.ptr;
            mutex.unlock();

            rhs.mutex.lock();
            immutable right = cast()*rhs.ptr;
            rhs.mutex.unlock();

            return left == right;
        }
    }
```

You'd have a type which could technically be used as a const Object but
which would not do the correct thing if it ever is. In contrast, right now,
while you can't call any of these functions with a const Object, you _can_
call them on a const reference of the derived type if the derived type has
const on them.
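
Here's a small sketch of that current behavior, assuming a druntime where
Object's versions of these functions are still non-const (which is the case
as I write this). The class C is just an example type.

```d
class C
{
    int i;

    this(int i)
    {
        this.i = i;
    }

    override bool opEquals(Object rhs)
    {
        if (auto r = cast(C) rhs)
            return opEquals(r);
        return false;
    }

    bool opEquals(const C rhs) const
    {
        return rhs !is null && this.i == rhs.i;
    }

    override size_t toHash() const @safe nothrow
    {
        return i;
    }
}

void main()
{
    const C c = new C(42);
    const Object o = c;

    // Through the derived type, the const overloads are found.
    static assert(__traits(compiles, c.opEquals(c)));
    static assert(__traits(compiles, c.toHash()));

    // Through Object, only Object's own non-const declarations are visible,
    // so none of these member calls compile on a const Object.
    static assert(!__traits(compiles, o.opEquals(o)));
    static assert(!__traits(compiles, o.toHash()));
    static assert(!__traits(compiles, o.toString()));

    assert(c.opEquals(c));
    assert(c.toHash() == 42);
}
```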

So, the code right now will do the correct thing, and it will work with
const in any normal situation, whereas if we put const on these functions,
such classes will have overloads that will not - and cannot - do the correct
thing. And while those Object overloads would not normally be used, if they
ever are, you have a bug - one which could be pretty annoying to track down,
depending on what the const overload does.

I completely agree with you that _most_ classes should have const on these
functions so that they can work with const references, but not all classes
will work with const, and I don't see why there is any need to make these
functions work with const Objects - const class references, yes, but not
const Objects.

Normal D code does not use Object directly, and we should be able to
templatize what little druntime code there is left which operates on Object
and needs to use one or more of these functions. Once that's done, we can
use an Edition to remove these functions from Object, and this entire issue
goes up in smoke.

Instead, what you're proposing also causes breakage, but it puts perfectly
legitimate use cases in a situation where they have to implement functions
which they literally cannot implement properly. And it's for a use case that
normal D code shouldn't even be doing - that is operating on Object instead
of whatever derived types the code base in question is actually using. D is
not Java. It may have made sense to put these functions on Object with D1,
but with D2, we have a powerful template system which generally obviates the
need to operate on Object. We shouldn't need to have these functions on
Object, and Editions should give us what we need to remove them in a
manageable way.

- Jonathan M Davis