Operator overloading through UFCS doesn't work
Jonathan M Davis
jmdavisProg at gmx.com
Sat Oct 13 13:15:29 PDT 2012
On Saturday, October 13, 2012 19:01:26 Tommi wrote:
> Another way to describe my reasoning...
>
> According to TDPL, if var is a variable of a user-defined type,
> then:
> ++var
> gets rewritten as:
> var.opUnary!"++"()
>
> Thus, it would be very logical to assume that it doesn't matter
> whether you write:
> ++var
> ...or, write the following instead:
> var.opUnary!"++"()
> ...because the second form is what the first form gets rewritten to
> anyway.
>
> But, that "very logical assumption" turns out to be wrong.
> Because in D, as it stands currently, it *does* make a difference
> whether you write it using the first form or the second:
>
> struct S
> {
>     int _value;
> }
>
> ref S opUnary(string op : "++")(ref S s)
> {
>     ++s._value;
>     return s;
> }
>
> Now, writing the following compiles and works:
> S var;
> var.opUnary!"++"();
>
> ...while the following doesn't compile:
> S var;
> ++var;
>
> This behavior of the language is not logical. And I don't think
> that logic is a matter of preference or taste.

All that TDPL is telling you is that the compiler uses "lowering" on
overloaded operators to generate their code. It lowers them to calls to the
appropriate functions. But it does so in a way consistent with how the
operators are supposed to work. Preincrement and postincrement are
fundamentally different. D makes the wise choice of making it so that you
simply overload increment and then has the compiler use that function in a
manner that generates code which is preincrementing or postincrementing
depending on which operator was used. This guarantees that preincrement and
postincrement are consistent. This is in contrast to C++ where you could make
them do totally different things. Not only does that avoid weird bugs, but it
makes it so that the compiler can generate more efficient code too. This is
because postincrementing generates a temporary to save the original value for
the expression where it's being used whereas preincrement does not, and if the
compiler knows that preincrement and postincrement are semantically the same,
then it can replace postincrement with preincrement when it doesn't matter
which is called. In D, because you overload _one_ operator, the compiler knows
this for user-defined types, but in C++, it doesn't, and can't make that
optimization. So, in C++, code like
    for(vector<int>::iterator i = v.begin(), e = v.end(); i != e; i++) {}
is stuck creating a temporary for every call to i++, whereas in D, it can be
replaced with ++i. Similarly, in D, >, >=, <=, and < are all translated to
calls to opCmp, making it so that you overload one function but get 4
operators.
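To make that concrete, here is a minimal sketch (the Counter type is made up
for illustration and is not from TDPL). It assumes a reasonably current D
compiler. A single member opUnary serves both ++c and c++, and a single opCmp
serves all four ordering operators:

```d
// Hypothetical type for illustration only.
struct Counter
{
    int value;

    // One overload; the compiler builds both ++c and c++ from it.
    ref Counter opUnary(string op : "++")()
    {
        ++value;
        return this;
    }

    // One function backs <, <=, > and >=.
    int opCmp(const Counter rhs) const
    {
        return value < rhs.value ? -1 : (value > rhs.value ? 1 : 0);
    }
}

void main()
{
    Counter c;
    ++c;              // lowered to c.opUnary!"++"()
    auto old = c++;   // roughly (auto t = c, ++c, t): the old value is kept
    assert(old.value == 1);
    assert(c.value == 2);

    auto d = Counter(5);
    // Each comparison is lowered to a test on c.opCmp(d) or d.opCmp(c).
    assert(c < d && c <= d && d > c && d >= c);
}
```

Note that c++ is not a direct call to opUnary: the compiler saves a copy
first, which is exactly the kind of "extra stuff" that a naive rewrite would
miss.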
There are cases where it would just be broken for the compiler to simply call
your overloaded operator function without doing extra stuff to ensure that it
acted like the built-in operators (incrementing being a prime example). So no,
it's _not_ simply a matter of calling your overloaded operator functions. It's
just that part of the process of compiling code using overloaded operators is
to translate it to code which involves calling the overloaded operator
functions. That translation may or may not be direct. TDPL makes a point about
it to show that the compiler is able to translate the overloaded operators
into function calls and then compile those instead of having to go to all of
the extra effort required to deal with fully compiling the overloaded operators
directly. It's just much simpler to turn one language construct into another,
existing language construct, and then compile that rather than having to
understand how to compile both. The same happens with other language
constructs as well (e.g. scope statements).
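As a rough illustration of the scope statement case (the function names are
made up, and the exact code the compiler generates is an implementation
detail), scope(exit) behaves like the try/finally that it is lowered to:

```d
// Hypothetical helpers; both record the same sequence of steps.
void withScope(ref int[] steps)
{
    steps ~= 1;
    scope(exit) steps ~= 3;   // runs when the enclosing scope is left
    steps ~= 2;
}

// Hand-written equivalent of what the scope statement is lowered to.
void withFinally(ref int[] steps)
{
    steps ~= 1;
    try
    {
        steps ~= 2;
    }
    finally
    {
        steps ~= 3;
    }
}

void main()
{
    int[] a, b;
    withScope(a);
    withFinally(b);
    assert(a == [1, 2, 3]);
    assert(a == b);
}
```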
TDPL _never_ says that syntactic sugar is applicable to lowered code. Lowering
code is effectively an implementation detail of the compiler that makes its
life easier. It does _not_ make it so that one language construct will be
translated into another where it will then be assumed that the new language
construct is using syntactic sugar such as UFCS, because _all_ of that
syntactic sugar must be lowered to code which _isn't_ syntactic sugar anymore.
It would be far more expensive to have to continually make passes to lower
code over and over again until no more lowering was required than it would be
to just have to lower it once.

You're reading way too much into what TDPL is saying. It's simply telling you
about how the compiler goes about translating code which uses operators such
as +, >, or = into the functions that you used to overload them. It's _not_
telling you that it'll do UFCS on overloaded operator functions. Heck,
technically, TDPL never really says that D _has_ UFCS. It talks about the
member call function syntax for _arrays_ (which D had for ages before it had
UFCS), not for types in general. It's only very recently that full UFCS has
been added to the language.

Both overloaded operators and UFCS use lowering to generate different code
which the compiler then compiles, but they _aren't_ mixed and they will
_never_ be mixed. If it had _ever_ been intended that it be possible to
overload operators as free functions, then we'd simply have made it so that
you could declare a free function like
    auto opBinary(string op)(Foo foo, Bar bar)
    {
        ...
    }
in the first place without requiring that it be a member function. But it _was_
required to be a member function, and it would make no sense to allow a new
feature to circumvent that restriction. If it was supposed to be
circumventable, then the restriction wouldn't have been put there in the first
place.
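For comparison, a small sketch of the difference (the Free and Member types
are made up for illustration; the free function mirrors the one in the quoted
example). It assumes the behavior described above, where operators are only
looked up as member functions:

```d
// Hypothetical types for illustration only.
struct Free
{
    int value;
}

// Free function: callable through UFCS, but ++ will not find it.
ref Free opUnary(string op : "++")(ref Free s)
{
    ++s.value;
    return s;
}

struct Member
{
    int value;

    // Member function: this is what ++ is lowered to.
    ref Member opUnary(string op : "++")()
    {
        ++value;
        return this;
    }
}

void main()
{
    Free f;
    f.opUnary!"++"();                          // explicit UFCS call compiles
    assert(f.value == 1);
    static assert(!__traits(compiles, ++f));   // the operator form does not

    Member m;
    ++m;                                       // the operator finds the member
    assert(m.value == 1);
}
```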
- Jonathan M Davis