Overloading operators by operator symbol

Mon Oct 30 18:03:37 PST 2006

Walter Bright wrote:
> Bill Baxter wrote:
>> I'm not a big fan of magic operator method names.  Python has its 
>> __add__ etc methods, Lua has very similar, D has opAdd etc.
>> Personally I prefer C++'s way of just using the syntax itself.  I find 
>> it a lot easier to remember and it looks less "magical".
> 
> The reasons for "opAdd" instead of "operator+" are:
> 
> 1) opAdd is eminently more greppable. Try grepping for operator+:
> 
>     operator /* comment */ + (T t)
> 
>     operator +\
>     +()
> 
>     operator+(T t)
> 
> Oops, I found operator/ instead! I thought operator++ was operator+! Is 
> that a unary + or a binary +? You practically need a full C++ front end 
> to do the job correctly. D can do tolerably well with simple grep.

Most of those are just perversities that don't exist in real code. 
/operator\s*\+[^+]/ would find you 99% of all real use cases.
On the other hand, say I want to find all operator overloads, period. 
With C++ I can pretty much just grep for 'operator', whereas for D I'd 
have to be a little smarter, because just grepping for 'op' is likely to 
turn up lots of cruft. OK, /\Wop[A-Z]/ would probably do a decent job, 
where \W is the 'not a word character' pattern.  Either way I think this 
one is pretty much a wash.  It's not that hard to grep for either one.
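
To make the grep comparison concrete, here is a rough sketch that runs the two patterns over a few sample lines using std::regex. The helper and constant names (grep_line, kCppPlus, kDOps, demo_grep) are made up for illustration, and the C++ pattern is written with the '+' escaped so that it matches a literal plus sign.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True if `line` contains a match for `pattern` (ECMAScript regex syntax).
inline bool grep_line(const std::string& line, const std::string& pattern) {
    return std::regex_search(line, std::regex(pattern));
}

// C++ side: "operator", optional whitespace, a literal '+',
// then any character that is not another '+'.
const std::string kCppPlus = R"(operator\s*\+[^+])";

// D side: a non-word character, then "op", then an uppercase letter.
const std::string kDOps = R"(\Wop[A-Z])";

inline void demo_grep() {
    // Ordinary declarations are found.
    assert(grep_line("T operator+(T t)", kCppPlus));
    assert(grep_line("T operator +(T t)", kCppPlus));
    // operator++ is correctly skipped.
    assert(!grep_line("T& operator++()", kCppPlus));
    // The pattern also hits operator+=, one of the remaining false positives.
    assert(grep_line("T& operator+=(T t)", kCppPlus));
    // The comment-in-the-middle case from the quoted post is missed.
    assert(!grep_line("operator /* comment */ + (T t)", kCppPlus));

    // D-style names are found; ordinary words containing "op" are not.
    assert(grep_line("T opAdd(T t)", kDOps));
    assert(!grep_line("int stop = 0;", kDOps));
    // \W needs a preceding character, so a name at column 0 is missed.
    assert(!grep_line("opAdd", kDOps));
}
```

Both patterns have blind spots (operator+= on one side, a name at the very start of a line on the other), which fits the "pretty much a wash" conclusion.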

> 
> 2) opAdd looks like "opAdd" in the object symbol table rather than "?H" 
> (I am not making up ?H, it really is that) giving one a clue without 
> needing a decoder ring.

I guess that's nice for the compiler writer.  Does it affect the user 
somehow too?  Because I'm not usually so concerned about how things look 
in the symbol table given all the name mangling going on everywhere.

Besides, couldn't one arrange things so that 'operator+' appeared in the 
symbol table as something like '__operator_plus' if one so desired?

> 3) it encourages the use of operator overloading for arithmetic 
> purposes, rather than "parse this predicate once", which happens with 
> C++ operator overloading.

I suppose.  But I suspect programmers will likely see it as a feature 
they can use no matter what you call it.  C++ books generally recommend 
not overloading + for things that are semantically unrelated to adding, 
but people do it anyway.  Similarly people use static opCall in D as a 
constructor.  If the programmer really wants a succinct syntax for some 
common operation, then they're going to consider operator overloading as 
one of their design choices, no matter what those methods are called.

> 4) operators that are mathematically related can be derived from each 
> other: in C++ the == and != are separately overloadable. Anyone who 
> wants to do mathematical overloading has to do both and take care that 
> one actually is the not of the other. With opEquals, one function can 
> serve both. This makes more of a difference with <, <=, >, >= where 4 
> overloads are replaced by opCmp.

Well, in C++ you can get by with operator < alone: via similar 
mathematical identities you can construct <=, >, >= out of it.
Given an a < b operator we have:

a > b  ===  b < a
a <= b === !(b<a)
a >= b === !(a<b)
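
Spelled out as C++, those identities might look like the sketch below. The Priority type and demo_derived function are hypothetical, just something small to hang the operators on. Only operator< is written by hand; the other three are one-liners built from it.

```cpp
#include <cassert>

// Hypothetical type, for illustration only.
struct Priority {
    int value;
};

// The one hand-written comparison.
inline bool operator<(const Priority& a, const Priority& b) {
    return a.value < b.value;
}

// The other three follow from the identities above.
inline bool operator>(const Priority& a, const Priority& b)  { return b < a; }
inline bool operator<=(const Priority& a, const Priority& b) { return !(b < a); }
inline bool operator>=(const Priority& a, const Priority& b) { return !(a < b); }

inline void demo_derived() {
    Priority lo{1}, hi{2};
    assert(lo < hi);
    assert(hi > lo);
    assert(lo <= hi);
    assert(lo <= lo);   // equal values satisfy <= ...
    assert(lo >= lo);   // ... and >=
    assert(!(lo >= hi));
}
```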

> 5) Note C++'s inability to distinguish operator[] as an lvalue and as an 
> rvalue. D has opIndexAssign and opIndex.

Seems C++ does ok there:
type& operator[](size_t i) { /* return element i */ }       // lvalue case
type  operator[](size_t i) const { /* return element i */ } // rvalue case
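
A slightly fuller sketch of that pair of overloads, using a hypothetical IntArray class (the class, its size, and demo_index are made up for illustration). Strictly speaking, the compiler chooses between the two based on whether the object is const, not on whether the result ends up being assigned to, so a plain read through a non-const object still goes through the first overload.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size container, for illustration only.
class IntArray {
public:
    // Non-const overload: returns a reference, so `a[i] = x` works.
    int& operator[](std::size_t i) {
        assert(i < kSize);
        return data_[i];
    }
    // Const overload: returns a copy, so the element cannot be modified.
    int operator[](std::size_t i) const {
        assert(i < kSize);
        return data_[i];
    }

private:
    static constexpr std::size_t kSize = 4;
    int data_[kSize] = {0, 0, 0, 0};
};

inline void demo_index() {
    IntArray a;
    a[1] = 7;                 // non-const overload, used as an lvalue
    const IntArray& view = a;
    assert(view[1] == 7);     // const overload, read-only
    assert(a[1] == 7);        // non-const overload again, even for a read
    assert(view[0] == 0);
}
```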

> 6) Note the kludge-o-matic C++ overloading of operator++ and its 
> different meanings. I can never remember which is which without looking 
> it up. D has opAddAssign and opPostInc.

Yeah, that is super hacky and hard to remember.  Maybe C++ should have 
added 'loperator' to distinguish left from right.
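
For anyone who hasn't run into it, this is the C++ convention being complained about, sketched with a made-up Counter type (demo_incr is likewise just for illustration). The two forms are told apart only by an unused int parameter on the postfix version.

```cpp
#include <cassert>

// Hypothetical type, for illustration only.
struct Counter {
    int n = 0;

    // Prefix form, ++c: no parameter, returns the updated object.
    Counter& operator++() {
        ++n;
        return *this;
    }

    // Postfix form, c++: the unused int parameter is only a tag that
    // tells the compiler which form this is. Returns the old value.
    Counter operator++(int) {
        Counter old = *this;
        ++n;
        return old;
    }
};

inline void demo_incr() {
    Counter c;
    Counter before = c++;     // postfix: yields the old value
    assert(before.n == 0);
    assert(c.n == 1);

    Counter& same = ++c;      // prefix: yields the object itself
    assert(same.n == 2);
    assert(&same == &c);
}
```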

Really this 'hard to remember' point is the main reason I think symbols 
for operator overloads would be superior.  Something like this, though I 
realize it's totally hopeless, would nonetheless be nice:

(Let 'i' mean 'this', though 'this' could be used instead.
  Let 'x' (or any non-i letter) mean the other thing where needed.)

operator[-i]   -- opNeg
operator[+i]   -- opPos
operator[~i]   -- opCom
operator[i++]  -- opPostInc
operator[i--]  -- opPostDec
operator[i+]   -- opAdd
operator[x+i]  -- opAdd_r
operator[i==]  -- opEquals
operator[i+=]  -- opAddAssign
operator[i in] -- opIn
operator[in i] -- opIn_r
operator[i[]]  -- opIndex
operator[i[]=] -- opIndexAssign
operator[i[..]] -- opSlice
operator[i[..]=] -- opSliceAssign
etc...

Then I don't have to remember what name the language chose to represent 
the operator, I just have to remember the syntactical situation in which 
I want that operator to be invoked.

I realize it's unconventional.  (I've never seen such a thing in a 
language before -- maybe Haskell comes close.)  But I've always been 
annoyed by operator overloading in the languages I've used.  Why not 
just make the operator declaration show the exact use case where the 
operator is invoked?

The above could even be implemented as some sort of preprocessor. It's 
just pure syntactic sugar for the more cryptic built-in method names. 
For opCmp, basically you'd only allow operator[i>] and then say it 
should return positive if i is greater, zero if equal, and negative if 
less.
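
The positive/zero/negative contract can be sketched in C++ terms as follows. This is only an illustration of the idea, not how D implements opCmp; the Rank type, its cmp member, and demo_cmp are all made up. One three-way function carries the logic and all four relational operators are read off its sign.

```cpp
#include <cassert>

// Hypothetical type; the member name `cmp` is made up for this sketch.
struct Rank {
    int value;

    // Positive if *this is greater than x, zero if equal, negative if less.
    int cmp(const Rank& x) const {
        return (value > x.value) - (value < x.value);
    }
};

// All four relational operators come from the sign of cmp.
inline bool operator<(const Rank& a, const Rank& b)  { return a.cmp(b) <  0; }
inline bool operator<=(const Rank& a, const Rank& b) { return a.cmp(b) <= 0; }
inline bool operator>(const Rank& a, const Rank& b)  { return a.cmp(b) >  0; }
inline bool operator>=(const Rank& a, const Rank& b) { return a.cmp(b) >= 0; }

inline void demo_cmp() {
    Rank lo{1}, hi{2};
    assert(hi.cmp(lo) > 0);
    assert(lo.cmp(hi) < 0);
    assert(lo.cmp(lo) == 0);
    assert(lo < hi);
    assert(hi > lo);
    assert(lo <= lo);
    assert(hi >= lo);
}
```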

The only issue, I think, is opApply / opApplyReverse, and there the 
problem is that these are not really operators to begin with, they're 
iterators.  Unlike an operator they have no associated syntax.

--bb