Operator overloading through UFCS doesn't work

Sat Oct 13 13:15:29 PDT 2012

On Saturday, October 13, 2012 19:01:26 Tommi wrote:
> Another way to describe my reasoning...
> 
> According to TDPL, if var is a variable of a user-defined type,
> then:
> ++var
> gets rewritten as:
> var.opUnary!"++"()
> 
> Thus, it would be very logical to assume that it doesn't matter
> whether you write:
> ++var
> ...or, write the following instead:
> var.opUnary!"++"()
> ...because the second form is what the first form gets written to
> anyway.
> 
> But, that "very logical assumption" turns out to be wrong.
> Because in D, as it stands currently, it *does* make a difference
> whether you write it using the first form or the second:
> 
> struct S
> {
>      int _value;
> }
> 
> ref S opUnary(string op : "++")(ref S s)
> {
>      ++s._value;
>      return s;
> }
> 
> Now, writing the following compiles and works:
> S var;
> var.opUnary!"++"();
> 
> ...while the following doesn't compile:
> S var;
> ++var;
> 
> This behavior of the language is not logical. And I don't think
> that logic is a matter of preference or taste.

All that TDPL is telling you is that the compiler uses "lowering" on 
overloaded operators to generate their code. It lowers them to calls to the 
appropriate functions. But it does so in a way consistent with how the 
operators are supposed to work. Preincrement and postincrement are 
fundamentally different. D makes the wise choice of making it so that you 
simply overload increment and then has the compiler use that function in a 
manner that generates code which is preincrementing or postincrementing 
depending on which operator was used. This guarantees that preincrement and 
postincrement are consistent. This is in contrast to C++ where you could make 
them do totally different things. Not only does that avoid weird bugs, but it 
makes it so that the compiler can generate more efficient code too. This is 
because postincrementing generates a temporary to save the original value for 
the expression where it's being used whereas preincrement does not, and if the 
compiler knows that preincrement and postincrement are semantically the same, 
then it can replace postincrement with preincrement when it doesn't matter 
which is called. In D, because you overload _one_ operator, the compiler knows 
this for user-defined types, but in C++, it doesn't, and can't make that 
optimization. So, in C++, code like

for(vector<int>::iterator i = v.begin(), e = v.end(); i != e; i++) {}

is stuck creating a temporary for every call to i++, whereas in D, the 
compiler can replace i++ with ++i. Similarly, in D, >, >=, <=, and < are all 
translated to calls to opCmp, making it so that you overload one function but 
get 4 operators.
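
As a rough sketch of the two points above (the Counter type and its field are 
made up for illustration, not taken from TDPL): a single member opUnary!"++" 
serves both ++c and c++, with the compiler saving the old value itself for the 
postfix form, and a single opCmp backs all four ordering operators.

```d
struct Counter
{
    int value; // hypothetical field, for illustration only

    // The one increment overload. The compiler uses it for both ++c and c++.
    ref Counter opUnary(string op)() if (op == "++")
    {
        ++value;
        return this;
    }

    // One comparison function backs <, <=, > and >=.
    int opCmp(const Counter rhs) const
    {
        return (value > rhs.value) - (value < rhs.value);
    }
}

void main()
{
    Counter c;
    ++c;            // lowered to a call to c.opUnary!"++"()
    auto old = c++; // the compiler copies c first, then calls c.opUnary!"++"()
    assert(old.value == 1);
    assert(c.value == 2);
    assert(old < c && old <= c && c > old && c >= old);
}
```

Note that opUnary here is a member of Counter. Nothing in the sketch has to say 
how c++ should behave; the compiler derives it from the one overload.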

There are cases where it would just be broken for the compiler to simply call 
your overloaded operator function without doing extra stuff to ensure that it 
acted like the built-in operators (incrementing being a prime example). So no, 
it's _not_ simply a matter of calling your overloaded operator functions. It's 
just that part of the process of compiling code using overloaded operators is 
to translate it to code which involves calling the overloaded operator 
functions. That translation may or may not be direct. TDPL makes a point about 
it to show that the compiler is able to translate the overloaded operators 
into function calls and then compile those instead of having to go to all of 
the extra effort required to deal with fully compiling the overloaded operators 
directly. It's just much simpler to turn one language construct into another, 
existing language construct, and then compile that rather than having to 
understand how to compile both. The same happens with other language 
constructs as well (e.g. scope statements).
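
To illustrate the scope statement case, here is a minimal sketch of a function 
using scope(exit) next to a hand-written try/finally that behaves the same way. 
The function names are hypothetical, and the second function only approximates 
what the compiler produces; it is not a dump of the actual lowered code.

```d
// Uses the scope statement directly.
void withScopeExit(ref string log)
{
    scope(exit) log ~= "cleanup;"; // runs when the function's scope is left
    log ~= "work;";
}

// Roughly the construct that the scope statement is turned into.
void handLowered(ref string log)
{
    try
    {
        log ~= "work;";
    }
    finally
    {
        log ~= "cleanup;";
    }
}

void main()
{
    string a, b;
    withScopeExit(a);
    handLowered(b);
    assert(a == "work;cleanup;");
    assert(a == b);
}
```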

TDPL _never_ says that syntactic sugar is applicable to lowered code. Lowering 
code is effectively an implementation detail of the compiler that makes its 
life easier. It does _not_ make it so that one language construct will be 
translated into another where it will then be assumed that the new language 
construct is using syntactic sugar such as UFCS, because _all_ of that 
syntactic sugar must be lowered to code which _isn't_ syntactic sugar anymore. 
It would be far more expensive to have to continually make passes to lower 
code over and over again until no more lowering was required than it would be 
to just have to lower it once.

You're reading way too much into what TDPL is saying. It's simply telling you 
about how the compiler goes about translating code which uses operators such 
as +, >, or = into the functions that you used to overload them. It's _not_ 
telling you that it'll do UFCS on overloaded operator functions. Heck, 
technically, TDPL never really says that D _has_ UFCS. It talks about the 
member function call syntax for _arrays_ (which D had for ages before it had 
UFCS), not for types in general. It's only very recently that full UFCS has 
been added to the language.

Both overloaded operators and UFCS use lowering to generate different code 
which the compiler then compiles, but they _aren't_ mixed and they will 
_never_ be mixed. If it had _ever_ been intended that it be possible to 
overload operators as free functions, then we'd simply have made it so that 
you could declare a free function like

auto opBinary(string op)(Foo foo, Bar bar)
{
    ...
}

in the first place without requiring that it be a member function. But it _was_ 
required to be a member function, and it would make no sense to allow a new 
feature to circumvent that restriction. If it was supposed to be 
circumventable, then the restriction wouldn't have been put there in the first 
place.
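
For contrast, a minimal sketch of the form that does work, reusing the Foo and 
Bar names from above with made-up fields: the operator is overloaded as a 
member of Foo, while a free function under an ordinary name (plus, which is 
hypothetical) can still be called through UFCS but is never consulted for 
foo + bar.

```d
struct Bar
{
    int n; // hypothetical field
}

struct Foo
{
    int n; // hypothetical field

    // Member form: this is what foo + bar is lowered to.
    Foo opBinary(string op)(Bar rhs) if (op == "+")
    {
        return Foo(n + rhs.n);
    }
}

// A free function with an ordinary name. UFCS lets you write foo.plus(bar),
// but the compiler does not look at it when it sees foo + bar.
Foo plus(Foo lhs, Bar rhs)
{
    return Foo(lhs.n + rhs.n);
}

void main()
{
    auto foo = Foo(1);
    auto bar = Bar(2);
    assert((foo + bar).n == 3);   // uses the member Foo.opBinary!"+"
    assert(foo.plus(bar).n == 3); // explicit UFCS call to the free function
}
```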

- Jonathan M Davis