Operator overloading

Don nospam at nospam.com
Sat Dec 27 02:51:02 PST 2008


Andrei Alexandrescu wrote:
> Bill Baxter wrote:
>> On Sat, Dec 27, 2008 at 9:42 AM, The Anh Tran <trtheanh at gmail.com> wrote:
>>> aarti_pl wrote:
>>>> Andrei Alexandrescu pisze:
>>>>  > We're trying to make that work. D is due for an operator overhaul.
>>>>  >
>>>>  > Andrei
>>>>
>>>> Is there any chance that we get possibility to overload "raw 
>>>> operators",
>>>> like in C++? I think that they may coexist with currently defined 
>>>> operator
>>>> overloads with simple semantic rules, which will not allow them to work
>>>> together at the same time.
>>>> ..........
>>>> BR
>>>> Marcin Kuszczak
>>>> (aarti_pl)
>>> Me also have a dream :D
>>>
>>> <Daydream mode>
>>> class Foo
>>> {
>>>        auto op(++)(); // bar++
>>>        auto op(++)(int); // ++bar
>>>
>>>        op(cast)(uint); // cast(uint)bar // opCast
>>>        auto op(())(int, float); // Foo(123, 123.456) // opCall
>>>
>>>        auto op(+)(Foo rhs); // bar1 + bar2
>>>        auto op(+=)(int); // bar += 1234;
>>>        auto op(.)(); // bar.xyz // opDot
>>>
>>>        Foo op([][][])(int, char, float); // bar[123]['x'][123.456]
>>>
>>>        auto op([..])(); // i = bar2[] // opSlice
>>>        auto op([..])(int, int); // bar[1..10]
>>>
>>>        auto op([..]=)(float); // bar[] = 12.3 // opSliceAssign
>>>        auto op([..]=)(int, int, float); // bar[1..3] = 123.4
>>> }
>>> </Dream>
>>
>> When I suggested this kind of thing long ago, Walter said that it
>> encourages operator overload abuse, because it suggests that  + is
>> just a generic symbolic operator rather than something that
>> specifically means "addition".  That's why D uses "opAdd" instead.
>> It's supposed to encourage only creating overloads that follow the
>> original meaning of the operator closely.  That way when you see a+b
>> you can be reasonably sure that it means addition or something quite
>> like it.
> 
> I think that argument is rather weak and ought to be revisited. It's 
> weak to start with, as writing "+" in a D program hardly evokes 
> anything else but "plus". What the notation effectively achieved was to put 
> more burden on the programmer to memorize some names for the 
> already-known symbols. I think the entire operator overloading business, 
> which started from a legitimate desire to improve on C++'s, ended up 
> worse off.

I feel quite strongly that C++'s operator overloading was a failed 
experiment. The original intention (AFAIK) was to allow the creation of 
mathematical types that could be used with natural syntax. The classic 
example was complex numbers, and it works reasonably well for that, 
although it requires you to write an absurd number of repetitive functions.
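
To make that concrete, here is roughly what the repetition looks like. 
(My sketch, written with D's current opAdd-style names rather than C++ 
syntax, but the shape is the same: every operator wants a value/value 
variant, a value/scalar variant, a scalar/value variant, and the 
op-assign form.)

struct Complex
{
    double re, im;

    Complex opAdd(Complex rhs)      // complex + complex
    {
        Complex r;
        r.re = re + rhs.re;
        r.im = im + rhs.im;
        return r;
    }

    Complex opAdd(double rhs)       // complex + real
    {
        Complex r;
        r.re = re + rhs;
        r.im = im;
        return r;
    }

    Complex opAdd_r(double lhs)     // real + complex
    {
        return opAdd(lhs);          // fine for +, which commutes; - and / need real work
    }

    void opAddAssign(Complex rhs)   // complex += complex
    {
        re += rhs.re;
        im += rhs.im;
    }

    // ... and the whole set again for -, *, and /.
}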

But for anything much more complicated, such as matrices, tensors, or 
big integer arithmetic, it's an abject failure. It's clumsy, and it 
creates masses of temporary objects, which kills performance so 
completely that it's unusable. Yet the whole point of operator 
overloading was to allow nice notation in a performance-oriented 
language! Expression templates are basically a hack to restore 
performance in most cases, but they come at a massive cost in 
simplicity, and even then the performance is not always optimal.
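
To illustrate the temporaries problem (my sketch, using a deliberately 
naive matrix type): every overloaded operator allocates and fills a 
full-sized temporary, so something like x = a*b + c + d costs several 
whole-matrix temporaries and several passes over memory, instead of one 
fused loop.

struct Matrix
{
    size_t n;
    double[] data;                  // n*n elements; allocation details omitted

    Matrix opAdd(Matrix rhs)
    {
        Matrix tmp;
        tmp.n = n;
        tmp.data = new double[data.length];     // a full-sized temporary
        for (size_t i = 0; i < data.length; ++i)
            tmp.data[i] = data[i] + rhs.data[i];
        return tmp;
    }

    // opMul is the same story, except each temporary also costs O(n^3) work.
}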

I think that Walter's idea of tightening the semantics of overloaded 
operators is the right approach. Unfortunately, it doesn't go far 
enough, so we get the worst of both worlds: the C++ freedom is 
curtailed, but there isn't enough power to replace it.

Ultimately, I think the problem is that, ideally, '+' is not simply a 
call to a function called 'plus()'. What you'd like an operator to 
compile to depends on the expression in which it is embedded. For 
maximum performance, an expression needs to be digested as a whole 
before it is converted into elementary functions.

In my 'operator overloading without temporaries' proposal in Bugzilla, 
I showed that DEFINING a -= b as being identical to a = a - b, and then 
creating a symmetric operation for a = b - a, allows optimal code 
generation in a great many cases. It's not a complete solution, though.
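
For those who haven't read the proposal, here is a rough sketch of the 
kind of rewriting it enables. The member names below are invented for 
illustration; they are not the proposal's actual syntax.

struct BigInt
{
    // ordinary in-place subtraction: this -= rhs
    void opSubAssign(BigInt rhs)      { /* subtract rhs in place */ }

    // hypothetical symmetric in-place form: this = rhs - this
    void reverseSubAssign(BigInt rhs) { /* negate this, then add rhs, in place */ }
}

void rewrites(ref BigInt a, BigInt b)
{
    // Since a -= b is defined to mean exactly a = a - b, the compiler may lower
    //     a = a - b;    as    a.opSubAssign(b);          // no temporary
    // and, given the symmetric operation, it may also lower
    //     a = b - a;    as    a.reverseSubAssign(b);     // no temporary
}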

In particular, irreducible temporaries need more thought. Ideally, in 
something like a += b * c + d, b*c would be created in a memory pool 
and deleted at the end of the expression.
(By contrast, a = b*c + d would translate to a = b*c; a += d; so no 
temporary is required.)
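
Sketched as code (my illustration of the lowering, for any sufficiently 
expensive type T):

// a = b*c + d digests into two in-place steps; no temporary is needed.
void assignForm(T)(ref T a, T b, T c, T d)
{
    a = b * c;      // the product is written straight into a
    a += d;
}

// a += b*c + d cannot avoid a temporary for b*c: it is irreducible.
// Ideally tmp would come from a memory pool and be released at the end
// of the full expression.
void opAssignForm(T)(ref T a, T b, T c, T d)
{
    T tmp = b * c;  // the irreducible temporary
    tmp += d;
    a += tmp;
}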

There are other, less serious problems which also need to be addressed.

Defining ++a as a += 1 is probably a mistake. It raises lots of nasty issues.
* If a is a complex number, a = a + 1 makes perfect sense. But it's not 
obvious that ++a is sensible.
* What type is '1'? Is it an int, a uint, a long, ...? You don't have 
that issue with increment.
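
A small, invented example of where that bites: if ++z lowers to z += 1, 
the '1' has to match one of the += overloads below (the int one, in 
this case), and ++z silently acquires whatever meaning that overload 
happens to have.

struct Cplx
{
    double re, im;

    void opAddAssign(int rhs)    { re += rhs; }   // ++z would end up here:
    void opAddAssign(double rhs) { re += rhs; }   // is that really an "increment"?
}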

As I see it, there are two possible strategies:
(1) Pursue optimal performance, which requires semantic tightening and 
reduced flexibility, or
(2) Pursue simplicity and semantic flexibility, sacrificing performance.

I think those two possibilities are mutually exclusive.




