operator overloading outside the type
Jonathan M Davis
newsgroup.d at jmdavisprog.com
Fri Mar 28 20:54:05 UTC 2025
On Friday, March 28, 2025 11:28:10 AM MDT sfp via Digitalmars-d wrote:
> On Friday, 28 March 2025 at 08:57:39 UTC, Jonathan M Davis wrote:
>
> Since I'm still fairly fresh to D, I want to try to understand
> which parts of what you wrote are your carefully considered
> opinion and which are hard fact, because it actually isn't very
> clear to me...
>
> > Overloaded operators are not normal functions, and they're not
> > used like normal functions.
>
> Is there any way that they aren't normal functions apart from the
> different, unique syntax?
The syntax is completely different, and their purpose is to either make a
type function more like a built-in type or to overload built-in behavior,
whereas functions are added behavior that is not overriding any built-in
behavior.
And in at least some cases, it's critical that overloaded operators be
part of the type for the language to work, because language features use
them, and having them defined in some stray module somewhere won't work.
opEquals is a prime example of this. The language is designed such that it
expects objects to be comparable, and there is plenty of code in places like
druntime which will just assume that the operator exists and use it, and
that will never work with any form of external operator overloading, because the
druntime code won't have any clue about an opEquals that is in some stray
module somewhere instead of being part of the type itself.
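As a rough sketch of what that looks like in practice (the Id type and its fields are made up for illustration), a member opEquals gets picked up by code that the programmer never wrote and never imported anything into, such as druntime's array comparison:

```d
struct Id
{
    int value;
    int generation; // hypothetical field that equality deliberately ignores

    // Member opEquals: found wherever Id is compared, regardless of imports.
    bool opEquals(const Id rhs) const
    {
        return value == rhs.value;
    }
}

void main()
{
    assert(Id(1, 0) == Id(1, 7));
    assert(Id(1, 0) != Id(2, 0));

    // Array comparison is implemented in druntime, yet it still uses
    // Id.opEquals for each element because the operator belongs to the type.
    assert([Id(1, 0), Id(2, 0)] == [Id(1, 5), Id(2, 9)]);
}
```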
toHash would be another example. You can't use toHash without having
declared it, but druntime's hashOf will use a default implementation as long
as opEquals isn't explicitly defined (and gives an error if it is, since it
has no guarantee that your opEquals is consistent with the default toHash).
And hashOf - and the internals of the AA implementation - isn't going to
have any clue about a toHash that you decide to declare in some module
somewhere. And even if it somehow did, what if you declared a toHash in one
module and another toHash in another module and then passed the AA around?
What would the AA even use? And remember that it can only have one version.
It's not like you're getting different AA types based on which module you
declare them in and which imports you use. Foo[Bar] is always Foo[Bar]. It
can't change based on imports.
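For illustration, here is a minimal sketch of a key type that carries its own toHash and opEquals (Key and its fields are hypothetical names, and the signatures are the ones I'd expect the AA implementation to look for on a struct):

```d
struct Key
{
    int id;
    string note; // hypothetical field that is not part of the key's identity

    // Both functions are members, so every int[Key] sees the same
    // pair no matter which module creates or uses the AA.
    size_t toHash() const @safe pure nothrow
    {
        return hashOf(id);
    }

    bool opEquals(ref const Key rhs) const @safe pure nothrow
    {
        return id == rhs.id;
    }
}

void main()
{
    int[Key] counts;
    counts[Key(1, "first")] = 10;
    counts[Key(1, "second")] += 5; // same id, so same slot

    assert(counts.length == 1);
    assert(counts[Key(1, "anything")] == 15);
}
```

Since toHash and opEquals agree on what identifies a Key, the AA behaves the same way wherever it gets passed.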
Unlike with normal functions, many of the uses of some of the overloaded
operators are completely outside the programmer's control, short of
simply avoiding the relevant language features, because some language features require
specific overloaded operators, and they need the operator to be associated
with the type to work.
> > Overloaded operators already have a number of restrictions on
> > them to try to enforce that they behave in a sensible way and
> > aren't abused. They're special and IMHO should be treated as
> > such.
>
> It would be really helpful to have a list of these restrictions,
> other than the main one being discussed (that they must be
> defined as member functions). I'm not aware of any others.
D's overloaded operators are designed in such a way as to try to prevent
abuses that are sometimes common in C++. For instance, rather than declaring
every comparison operator as a different function, we have only two -
opEquals and opCmp. opEquals is used to define both == and != so that
they're always consistent, whereas in C++, they could actually do completely
different things. Similarly, opCmp returns a value which is compared against
0 to see whether the left-hand value is less than, equal to, or greater than
the right-hand value so that <, <=, >=, and > can all be created from the
same function, whereas in C++, you'll get nonsense like folks trying to
create <- from the < and - operators, because they can all be overloaded
individually. You also don't have the guarantee that all of the associated
operators were overloaded if one of them was.
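A small sketch of that (SemVer is a made-up type for the example): a single opCmp gives you all four ordering operators, and they can't disagree with each other.

```d
struct SemVer
{
    int major;
    int minor;

    // One function; the compiler rewrites a < b as a.opCmp(b) < 0,
    // a >= b as a.opCmp(b) >= 0, and so on.
    int opCmp(const SemVer rhs) const
    {
        if (major != rhs.major)
            return major < rhs.major ? -1 : 1;
        if (minor != rhs.minor)
            return minor < rhs.minor ? -1 : 1;
        return 0;
    }
}

void main()
{
    auto a = SemVer(1, 2);
    auto b = SemVer(1, 10);

    assert(a < b);
    assert(a <= b);
    assert(b > a);
    assert(b >= a);

    // No opEquals is declared here, so == and != use the default
    // memberwise comparison.
    assert(a != b);
    assert(a == SemVer(1, 2));
}
```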
Another example is that both pre and post increment are defined from a
single overloaded operator so that they have to do the same thing, and the
compiler does the logic for pre vs post, which also means that the compiler
can rely on them doing the same thing, making it so that it can freely
replace a post-increment operation with a pre-increment operation when the
result isn't used, whereas C++ can't make that assumption, because they're
separate operators with their own implementations.
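As a sketch (Counter is a hypothetical type), only the pre-increment form is ever written by hand, and the compiler derives the post-increment from it by copying the old value first:

```d
struct Counter
{
    int n;

    // The only increment operator that gets defined.
    ref Counter opUnary(string op)() if (op == "++")
    {
        ++n;
        return this;
    }
}

void main()
{
    auto c = Counter(0);

    // Post-increment: the compiler saves a copy, calls opUnary!"++",
    // and yields the saved copy.
    auto old = c++;
    assert(old.n == 0);
    assert(c.n == 1);

    // Pre-increment: calls opUnary!"++" directly.
    auto now = ++c;
    assert(now.n == 2);
    assert(c.n == 2);
}
```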
Of course, not all abuses of overloaded operators are prevented by D, but
D's overloaded operators are designed to work in a restricted manner to
reduce the possibility of abuse, and they're purposefully not as flexible in
what you can do with them as what you can do with a normal function.
> > They're supposed to be a core part of the type, not just a
> > function that accepts the type.
>
> Why? I hear this claimed a lot but because I completely disagree
> with this point it's hard for me to understand what its basis is.
How overloaded operators do or don't work with a type is a core part of its
design and affect how it interacts with the language in general, including
how it interacts with language features, whereas a normal function isn't
anything special and isn't treated by the language in any special way.
This is especially true with operators that overload the default behaviors
of the language for a type.
In such cases, trying to define an operator externally would be like trying
to define a constructor externally. It doesn't work, because the language
and runtime need it all tied up with the type so that it's always present
and consistent instead of depending on what you imported or not.
> > Imagine the weird side effects that you'd get and hard to track
> > down bugs if something like opCast were overloaded externally.
>
> Sounds scary, but my imagination isn't good enough. Can you
> please provide an example?
Would you want to deal with code where the way that opCast works changes
based on the imports that you use? And what if someone overrides the normal
casting behavior? For instance,
```
void main()
{
    import std.stdio;
    writeln(cast(Bar) Foo.init);
}

struct Foo
{
    int i;
}

struct Bar
{
    int i;
}
```
will print out
```
Bar(0)
```
But if you add a cast operator to Foo, e.g.
```
struct Foo
{
    int i;

    auto opCast(T : Bar)()
    {
        return Bar(42);
    }
}
```
then in this case, you'd get
```
Bar(42)
```
instead. Here, that's controlled by the type and so is unaffected by
imports, but if you could overload opCast externally, then it _would_ be
affected by imports and code that was written to work in a particular way
would change in potentially drastic and silent ways due to an import - one
that could have been added in order to do something unrelated but
accidentally brought along the opCast.
This is in sharp contrast to functions which are not built-in in any way, do
not get automatically generated in any fashion, and are designed to be
overloadable and have mechanisms for dealing with conflicts. Operators are
designed to be part of the type, and the language treats them that way.
And with something like opCast, you _really_ want it to be doing the
expected thing. Casts are error-prone enough when you use them correctly,
let alone what we'd get if they started being affected by imports -
especially with how blunt of an instrument they are.
> > And some operators clearly can't be overloaded externally -
> > such as opEquals - because the compiler generates the code for
> > them if they're not there. Others simply couldn't work in any
> > sane fashion if they were external (e.g. opDispatch).
>
> Are these hard technical restrictions?
For some operators like opEquals, they're so strongly tied to the language
that the compiler will generate code that uses them, and it will use that
code in places where the programmer has no control and can't possibly import
anything. So, those clearly aren't overridable.
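To sketch the kind of generated code I mean (Inner and Outer are made-up types): Outer declares no opEquals at all, so the compiler generates a memberwise comparison for it, and that generated comparison calls Inner's opEquals without the programmer writing or importing anything.

```d
struct Inner
{
    int v;

    // Hypothetical rule for the example: values are equal modulo 10.
    bool opEquals(const Inner rhs) const
    {
        return v % 10 == rhs.v % 10;
    }
}

struct Outer
{
    Inner inner;
    int tag;
    // No opEquals here; the compiler generates one that compares
    // each member, using Inner.opEquals for the inner field.
}

void main()
{
    assert(Outer(Inner(3), 1) == Outer(Inner(13), 1));
    assert(Outer(Inner(3), 1) != Outer(Inner(4), 1));
    assert(Outer(Inner(3), 1) != Outer(Inner(3), 2));
}
```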
For opDispatch specifically, it _might_ be possible, but if it is, it would
be a disaster in the making. opDispatch is already pretty ugly with it being
on the type itself. It basically takes over _all_ matching member function
calls when you try to call a member function that the type doesn't actually have -
so if opDispatch's parameters match the "function" that you're trying to
call, it calls opDispatch instead. UFCS and opDispatch already don't get
along very well as a result. opDispatch always wins if there's a conflict,
but it's pretty easy to think that you're going to be getting UFCS and then
get opDispatch instead depending on how well you understand it and what
you're trying to do. But what on earth would the rules be if opDispatch were
external? Would opDispatch still always win? Would it result in a symbol
conflict? What if there were multiple opDispatches? It's a function that's
designed to work when there are no other matches, and if you ever have to
call it explicitly, it's drastically different from calling it as intended.
Even if it's possible to make it work, it's just going to be a confusing
mess - and opDispatch is already confusing enough when it's directly on the
type.
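A minimal sketch of the existing UFCS conflict (Loose and describe are hypothetical names): the free function looks like it should be callable with member syntax, but opDispatch gets there first.

```d
struct Loose
{
    // Accepts any member name that isn't otherwise declared on the type.
    string opDispatch(string name)()
    {
        return "opDispatch: " ~ name;
    }
}

// A free function that one might expect to reach via UFCS.
string describe(Loose)
{
    return "free function";
}

void main()
{
    Loose l;

    // Member syntax never gets as far as UFCS, because opDispatch matches.
    assert(l.describe() == "opDispatch: describe");

    // The free function is only reachable with a normal call.
    assert(describe(l) == "free function");
}
```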
> > IMHO, if you want to add operators to a type, then wrap it. It
> > doesn't require making the language rules any more weird or
> > confusing, and it's straightforward.
>
> It imposes a large burden on the programmer... One of the selling
> points of D is that it's high leverage (get a lot done while
> writing few lines of code). Enabling free operator definitions is
> in line with this feature. Restricting it is "straightforward"
> but is more in line with a language like Go: bondage and
> discipline in the name of keeping things regular and predictable,
> but at the cost of having to write many more lines of code.
I don't see how it imposes any real burden on the programmer to require that
operators be on the type itself. They're a language feature that the
programmer is hijacking or emulating, not simply function calls with
different names or syntax.
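For what it's worth, a wrapper along these lines is only a few lines of code. This is just a sketch under the assumption that Vec2 comes from some library that you can't modify (both Vec2 and MyVec2 are made-up names):

```d
// Stand-in for a type from a library that has no operators.
struct Vec2
{
    double x;
    double y;
}

struct MyVec2
{
    Vec2 payload;
    alias payload this; // keeps the wrapper usable wherever a Vec2 is expected

    // The operators live on the wrapper, so they're part of its type.
    MyVec2 opBinary(string op)(MyVec2 rhs) const
        if (op == "+" || op == "-")
    {
        return MyVec2(Vec2(mixin("x " ~ op ~ " rhs.x"),
                           mixin("y " ~ op ~ " rhs.y")));
    }
}

void main()
{
    auto a = MyVec2(Vec2(1, 2));
    auto b = MyVec2(Vec2(3, 5));

    auto sum = a + b;
    assert(sum.x == 4 && sum.y == 7);

    auto diff = b - a;
    assert(diff.x == 2 && diff.y == 3);

    // alias this converts back to the library type when needed.
    Vec2 plain = sum;
    assert(plain.x == 4);
}
```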
> As for "weird or confusing"... it's completely subjective...
True, but if you start thinking through what the current semantics are in
detail of each of the operators and where they're used, a number of them
either cannot be overloaded if they're not part of the type, or it would
cause a number of problems if it were allowed - especially with any
operators that the language or runtime use.
Some operators - e.g. +, -, *, and / - might not be a total disaster if they
were external, because they aren't used by the language or runtime and
aren't assumed to exist, but there also isn't much to gain by allowing it,
particularly since they're not features that are designed with symbol
resolution or conflicts in mind. It's also simpler if we can just say that
no operators can be overloaded externally, because then it's easy to understand rather
than having to explain with each and every operator why it does or doesn't
make sense to allow it.
And honestly, I wouldn't want to deal with the bugs which would come with
attempting to allow external operators at this point. We get nasty bugs
every time that anything is done to the compiler which relates to symbol
resolution or imports, and we're talking about features that were explicitly
designed and implemented with the idea that they were strictly part of the
type and therefore did not play into any of the symbol resolution or
conflict issues at all. This is especially true with any operators which can
be used in any fashion without overloading them first.
- Jonathan M Davis