Overloading operators by operator symbol

Tue Oct 31 13:28:07 PST 2006

Bill Baxter wrote:
> Walter Bright wrote:
>> Oops, I found operator/ instead! I thought operator++ was operator+! 
>> Is that a unary + or a binary +? You practically need a full C++ front 
>> end to do the job correctly. D can do tolerably well with simple grep.
> Most of those are just perversities that don't exist in real code. 
> /operator\s*\+[^+]/ would find you 99% of all real use cases.

Perhaps you're right, but I sure get tired of things in C++ that work 
only most of the time (and I didn't even get into what the preprocessor 
can do to any reliance on grep). I like things to work reliably. I want 
to make sure I found all the operator overload cases when I do a code audit.
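
To make the audit concern concrete, here is a small sketch that runs the quoted pattern over a few declarations. It writes the '+' escaped so that it matches a literal plus sign. The Vec declarations and the OP macro are hypothetical, made up only for this illustration, and std::regex stands in for grep.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if the line would be reported by the grep-style pattern
// from the quoted message (with the '+' escaped to mean a literal plus).
bool looks_like_binary_plus(const std::string& line) {
    static const std::regex pattern(R"(operator\s*\+[^+])");
    return std::regex_search(line, pattern);
}

// Hypothetical sample declarations; none of them come from real code.
void check_samples() {
    // Binary + is found, as intended.
    assert(looks_like_binary_plus("Vec operator+(const Vec& rhs) const;"));
    // Unary + is reported too: the pattern cannot tell the two apart.
    assert(looks_like_binary_plus("Vec operator+() const;"));
    // operator++ is correctly skipped.
    assert(!looks_like_binary_plus("Vec& operator++();"));
    // operator+= is reported, although it is a different operator.
    assert(looks_like_binary_plus("Vec& operator+=(const Vec& rhs);"));
    // Assuming '#define OP(x) operator x', this overload is missed entirely.
    assert(!looks_like_binary_plus("Vec OP(+)(const Vec& rhs) const;"));
}
```

Calling check_samples() passes every assertion: out of five lines the pattern gives two false positives and one miss, which is the kind of "works most of the time" result an audit cannot rely on.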

There's a thread on comp.lang.c about writing a program that can convert 
C++ // comments to /* */ comments. Most of the thread is about all the 
weird corner cases (like trigraphs, line splicing, etc.) that can happen 
in C++ and how doing a correct job of it is far more complicated than it 
looks like it should be. This is not unusual, but typical of C++ source 
code analysis problems.

>> 2) opAdd looks like "opAdd" in the object symbol table rather than 
>> "?H" (I am not making up ?H, it really is that) giving one a clue 
>> without needing a decoder ring.
> I guess that's nice for the compiler writer.  Does it affect the user 
> somehow too?  Because I'm not usually so concerned about how things look 
> in the symbol table given all the name mangling going on everywhere.

How they look in the symbol table matters when you're having problems 
getting things to link properly or getting error messages from the 
linker or looking at exported names from a DLL or using a debugger 
without full debug info or using a disassembler, etc.

> Besides, couldn't one arrange things so that 'operator+' appeared in the 
> symbol table as something like '__operator_plus' if one so desired?

Yes, one could. But it's one less level of indirection to connect 
"opAdd" in the symbol table with "opAdd" in the source code.

>> 3) it encourages the use of operator overloading for arithmetic 
>> purposes, rather than "parse this predicate once", which happens with 
>> C++ operator overloading.
> I suppose.  But I suspect programmers will likely see it as a feature 
> they can use no matter what you call it.  C++ books generally recommend 
> not overloading + for things that are semantically unrelated to adding, 
> but people do it anyway.  Similarly people use static opCall in D as a 
> constructor.  If the programmer really wants a succinct syntax for some 
> common operation, then they're going to consider operator overloading as 
> one of their design choices, no matter what those methods are called.

Programmers can and will do whatever they want, but it helps to 
encourage correct usage by following the dictum "if it looks wrong, it 
probably is wrong". And overloading opAdd to be "parse" is going to look 
wrong, wrong, wrong.

>> 4) operators that are mathematically related can be derived from each 
>> other: in C++ the == and != are separately overloadable. Anyone who 
>> wants to do mathematical overloading has to do both and take care that 
>> one actually is the negation of the other. With opEquals, one function can 
>> serve both. This makes more of a difference with <, <=, >, >= where 4 
>> overloads are replaced by opCmp.
> 
> Well, operator < alone is used in C++, and via similar mathematical 
> identities you can construct <=, >, >= out of it.
> given a < b operator we have:
> 
> a > b  ===  b < a
> a <= b === !(b<a)
> a >= b === !(a<b)

I know you can construct those identities in C++, but the point is you 
have to manually construct them every time, which is tedious and a 
source of error. C++ won't do it for you.
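
As a minimal sketch of what that manual construction looks like, assume a hypothetical Version type (the type and its field names are invented for this example). Only operator< carries real logic; the rest is written out by hand from the identities quoted above.

```cpp
#include <cassert>

// Hypothetical value type used only for illustration.
struct Version {
    int major_no;
    int minor_no;
};

// The one comparison that carries real logic.
bool operator<(const Version& a, const Version& b) {
    if (a.major_no != b.major_no) return a.major_no < b.major_no;
    return a.minor_no < b.minor_no;
}

// Boilerplate derived from operator<; each line is a chance to swap
// a and b or to drop a '!'.
bool operator>(const Version& a, const Version& b)  { return b < a; }
bool operator<=(const Version& a, const Version& b) { return !(b < a); }
bool operator>=(const Version& a, const Version& b) { return !(a < b); }

// == and != are separate overloads as well and must be kept consistent.
bool operator==(const Version& a, const Version& b) {
    return a.major_no == b.major_no && a.minor_no == b.minor_no;
}
bool operator!=(const Version& a, const Version& b) { return !(a == b); }

void check_version_ordering() {
    Version v1{1, 2}, v2{1, 3}, v3{1, 2};
    assert(v1 < v2 && v2 > v1);
    assert(v1 <= v3 && v1 >= v3);
    assert(v1 == v3 && v1 != v2);
}
```

That is six functions for one type, and the same six again for the next type. The D counterpart described in point 4 is one opCmp plus one opEquals.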

>> 5) Note C++'s inability to distinguish operator[] as an lvalue and as 
>> an rvalue. D has opIndexAssign and opIndex.
> 
> Seems C++ does ok there:
> type& operator[](int i) { }      // lvalue case
> type operator[](int i) const { } // rvalue case

That's by learned and commonly followed convention, not by design. Even 
worse, the lvalue case is restricted to only allow assignment through 
the reference - making it impossible to have an lvalue case where some 
post processing needs to be done with the new contents of the lvalue.
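
A small sketch of the post-processing problem, using a hypothetical SummedArray class that caches the sum of its elements (the class and its member names are assumptions made up for this example). The named set() member stands in for the role the text gives to opIndexAssign.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container that caches the sum of its elements.
class SummedArray {
public:
    explicit SummedArray(std::size_t n) : data_(n, 0), sum_(0) {}

    // Chosen only when the object itself is const, not when the
    // expression happens to be used as an rvalue.
    int operator[](std::size_t i) const { return data_[i]; }

    // Hands out a reference. The store happens in the caller after this
    // function has returned, so the class cannot react to the new value.
    int& operator[](std::size_t i) { return data_[i]; }

    // Named setter: sees the new value and can update the cache.
    void set(std::size_t i, int value) {
        sum_ += value - data_[i];
        data_[i] = value;
    }

    int cached_sum() const { return sum_; }

private:
    std::vector<int> data_;
    int sum_;
};

void check_stale_cache() {
    SummedArray a(3);
    a.set(0, 5);                  // cache updated
    assert(a.cached_sum() == 5);
    a[1] = 7;                     // write goes through the reference
    assert(a[1] == 7);
    assert(a.cached_sum() == 5);  // stale: the elements now add up to 12
}
```

The write through a[1] never passes back through the class, so the cached sum silently goes stale. Only the named setter keeps it correct, at the cost of giving up the index syntax for writes.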

> Really this 'hard to remember' point is the main reason I think symbols 
> for operator overloads would be superior.  Something like this, though I 
> realize totally hopeless, would nonetheless be nice:
> 
> (Let 'i' mean 'this', though 'this' could be used instead.
>  Let 'x' (or any non-i letter) mean the other thing where needed.)
> 
> operator[-i]   -- opNeg
> operator[+i]   -- opPos
> operator[~i]   -- opCom
> operator[i++]  -- opPostInc
> operator[i--]  -- opPostDec
> operator[i+]   -- opAdd
> operator[x+i]  -- opAdd_r
> operator[i==]  -- opEquals
> operator[i+=]  -- opAddAssign
> operator[i in] -- opIn
> operator[in i] -- opIn_r
> operator[i[]]  -- opIndex
> operator[i[]=] -- opIndexAssign
> operator[i[..]] -- opSlice
> operator[i[..]=] -- opSliceAssign
> etc...
> 
> Then I don't have to remember what name the language chose to represent 
> the operator, I just have to remember the syntactical situation in which 
> I want that operator to be invoked.

You'd have to remember the funky syntactical oddities for each operator 
in the above notation (note that it's inconsistent). I don't think 
there's any real improvement.

> The only issue is I think opApply / opApplyReverse, and there the 
> problem is that these are not really operators to begin with, they're 
> iterators.  Unlike an operator they have no associated syntax.

Using the opXxxx convention does enable the overloading of operations 
that do not have an obvious operator symbol.