templating opEquals/opCmp (e.g. for DSL/expression templates)

Wed Feb 13 01:50:21 UTC 2019

On Wednesday, 13 February 2019 at 00:56:48 UTC, H. S. Teoh wrote:
> Frankly, I think it's a very good thing that in D comparison 
> operators are confined to opEquals/opCmp.

So do I. Many more objects have a partial ordering than have 
arithmetic, so having opCmp under opBinary would be annoying.
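
To make the partial-ordering point concrete, here is a minimal sketch of a type that has an ordering but no arithmetic. `FlagSet` is a made-up name, and the sketch assumes (as I understand the `a.opCmp(b) < 0` rewrite) that opCmp may return a float, so that NaN can stand for "incomparable":

```d
/// Hypothetical small set of flags, ordered by inclusion (a partial order).
struct FlagSet
{
    uint bits;

    /// Returns a float so that incomparable sets can yield NaN,
    /// which makes <, <=, > and >= all evaluate to false.
    float opCmp(const FlagSet rhs) const
    {
        if (bits == rhs.bits) return 0;
        if ((bits & rhs.bits) == bits) return -1;     // proper subset of rhs
        if ((bits & rhs.bits) == rhs.bits) return 1;  // proper superset of rhs
        return float.nan;                             // neither contains the other
    }
}

unittest
{
    auto a = FlagSet(0b001);
    auto b = FlagSet(0b011);
    auto c = FlagSet(0b100);

    assert(a < b && a <= b && b > a);
    assert(!(a < c) && !(a >= c) && a != c); // a and c are incomparable
}
```

Nothing here needs `+` or `*`, which is why I would not want comparison folded into opBinary.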

> If anything, I would vote for enforcing opEquals to return bool 
> and bool only.

That would be a backwards incompatible change, like it or not.

> The reason for this is readability and maintainability.  
> Symbols like <= or == should mean one and only one thing in the 
> language, and should not be subject to random overloaded 
> interpretations. Being built-in operators, they are used 
> universally in the language, and any inconsistency in semantics 
> hurts readability, comprehension, and maintainability, such as 
> C++'s free-for-all operator overloading, where any piece of 
> syntax can have wildly-divergent interpretations (even 
> completely different parse trees) depending on what came 
> before.  For example, recently I came up with a C++ monstrosity 
> where the lines:
>
>         fun<A, B>(a, b);
>         gun<T, U>(a, b);
>
> have wildly-different, completely unrelated *parse trees*, as 
> an illustration of how unreadable C++ can become.
>
> Yes, it's extremely flexible, yes it's extremely powerful and 
> can express literally *whatever* you want it to express. It's 
> also completely unreadable and unmaintainable, because the 
> surface structure of the code text becomes completely detached 
> from the actual underlying semantics. I don't even want to 
> imagine what debugging such kind of code must be like.

Thank goodness we use ! for template instantiation and don't have 
`,` available for overloading!

> Operator overloading should be reserved for objects that behave 
> like arithmetical entities in some way.

Like symbolic math.

> And especially comparison operators should not have any other 
> meaning than the standard meaning.  It should be illegal to 
> make == and <= mean something completely unrelated to each 
> other.

They already mean different things: <= is an ordering, == is 
equality. Yes, it would be bad for a == b to be true while a <= b 
is false. I'm sure you could already achieve that outcome, but 
would you? Of course not, it'd be stupid.

> If you need to redefine comparison operators, what you want is 
> really a DSL wherein you can define operators to mean whatever 
> you want.

Yes.

> The usual invocation of such a DSL as a compile-time argument, 
> say something like this:
>
> 	myDsl!`a <= b`
>
> contains one often overlooked, but very important element: the 
> identifier `myDsl`, that sets it apart from other uses of `<=`, 
> and clearly identifies which interpretation should be ascribed 
> to the symbols found in the template argument.

It's also much uglier, and it does not compose with things that 
use <=, i.e. generic functions.

> See, the thing is, when you see a random expression with 
> arithmetical operators in it, the expected semantics is the 
> computation of some kind of arithmetic objects producing an 
> arithmetical result -- because that's what such expressions 
> mean in general, in the language.  It's abusive to overload 
> that to mean something else entirely -- because there is no 
> warning sign to the reader of the code that something different 
> is happening.

Yes, that's the use/abuse distinction; see my other post.

> When you see a line like:
>
>         fun<A, B>(a, b);
>
> and then somewhere else a line like:
>
>         gun<T, U>(a, b);
>
> the actual behaviour of the code should be similar enough that 
> you can correctly guess the semantics.  It should not be that 
> the first line instantiates and calls a template function, 
> whereas the second is evaluated as an expression with a bunch 
> of overloaded comma and comparison operators.

Indeed! That is not what is being proposed at all!

> Similarly, it should not be the case that:
>
> 	auto x = a <= b;
>
> evaluates a comparison expression and assigns a boolean value 
> to x, whereas:
>
> 	auto y = p <= q;
>
> creates an expression object capturing p and q, that needs to 
> be called later before it yields a boolean value.

No. auto y = p <= q; should not e.g. open a socket (you could 
probably do that already with an impure opCmp). Being able to 
capture the expression `p <= q` is the _entire point_ of the 
proposal.
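
For reference, a rough sketch of where things stand today, as far as I know. `Sym` and `lessEq` are made-up names for illustration. Arithmetic can already be captured through opBinary, but the comparison has to go through a named method, because `a <= b` is rewritten to `a.opCmp(b) <= 0` and so ends up as a bool:

```d
/// Hypothetical symbolic node: it just carries a textual form of the expression.
struct Sym
{
    string text;

    /// Arithmetic operators can already return an expression object.
    Sym opBinary(string op)(Sym rhs) const
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        return Sym("(" ~ text ~ " " ~ op ~ " " ~ rhs.text ~ ")");
    }

    /// Stand-in for `this <= rhs`: a named method, since the result of
    /// the built-in <= rewrite is a bool rather than an expression object.
    Sym lessEq(Sym rhs) const
    {
        return Sym("(" ~ text ~ " <= " ~ rhs.text ~ ")");
    }
}

unittest
{
    auto p = Sym("p");
    auto q = Sym("q");
    auto y = (p + q).lessEq(p * q); // captured, not evaluated
    assert(y.text == "((p + q) <= (p * q))");
}
```

The proposal is about being able to write `(p + q) <= (p * q)` there instead of calling `lessEq`.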

> With a string DSL, that little identifier `myDsl` (or whatever 
> identifier you choose for this purpose) serves as a cue to the 
> reader of the code that something special is happening here.  
> For example:
>
> 	auto y = deferred!`p <= q`;
>
> immediately tells the reader of the code that the <= is to be 
> understood with a different meaning than the usual <= operator.

It's also much uglier, and it does not compose with things that 
use <=, i.e. generic functions.

> Just as an expression like:
>
> 	auto dg = (p, q) => p <= q;
>
> by virtue of its different syntax tells the reader that the 
> expression `p <= q` isn't being evaluated here and now, as it 
> otherwise would be.

Can't do symbolic computation with that.

> The presence of such visual cues is good, and is the way things 
> should be done.
>
> It should not be that something that looks like an expression, 
> evaluated here and now, should turn out to do something else. 
> That kind of free-for-all, can-mean-literally-anything 
> semantics makes code unreadable, unmaintainable, and a ripe 
> breeding ground for bugs -- someone (i.e., yourself after 3 
> months) will inevitably forget (or not know) the special 
> meaning of <= in this particular context and write wrong code 
> for it.

Type autocompletion will tell you the type of `p <= q`; if it is 
still unclear at that point, you have bigger problems. In generic 
code you have no choice but to assume that `p <= q` is a 
comparison; if someone uses that code with a symbolic engine, the 
meaning doesn't change.
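
A small sketch of what I mean by generic code, with made-up names (`smaller`, `SemVer`). The template only assumes that `<=` is an ordering, and any type that defines opCmp sensibly just works with it; a string DSL argument could not be passed through a function like this:

```d
/// Hypothetical generic helper: it relies only on `<=` being an ordering.
T smaller(T)(T a, T b)
{
    return a <= b ? a : b;
}

/// Hypothetical user type that opts in by defining opCmp.
struct SemVer
{
    int major, minor;

    int opCmp(const SemVer rhs) const
    {
        if (major != rhs.major) return major < rhs.major ? -1 : 1;
        if (minor != rhs.minor) return minor < rhs.minor ? -1 : 1;
        return 0;
    }
}

unittest
{
    assert(smaller(3, 7) == 3);
    assert(smaller(SemVer(1, 4), SemVer(1, 2)) == SemVer(1, 2));
}
```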

> P.S. And as a bonus, a string DSL gives you the freedom to 
> employ operators not found among the built-in D operators, for 
> example:
>
> 	auto result = eval!`(f∘g)(√x ± √y)`;
>
> And if you feel the usual string literals are too cumbersome 
> to use for long expressions, there's always the under-used 
> token strings to the rescue:
>
> 	auto result = eval!q{
> 		( (f ∘ g)(√(x + y) ± √(x - y)) ) / |x|·|y| +
> 		2.0 * ∫ f(x)·dx
> 	};
>
> The `eval` tells the reader that something special is happening 
> here, and also provides a name by which the reader can search 
> for the definition of the template that processes this 
> expression, and thereby learn what it means.
>
> Without this little identifier `eval`, it would be anyone's 
> guess as to what the code is trying to do.

I would expect it to compute the tuple
(f(g(sqrt(x+y) + sqrt(x-y)))/(abs(x).dot(abs(y))) + 2*integrate(f),
 f(g(sqrt(x+y) - sqrt(x-y)))/(abs(x).dot(abs(y))) + 2*integrate(f))

(you missed the bounds on the integral and x is ambiguous in the 
integral)

What else would I think it would do? If that guess is wrong, then 
the person has abused the operators; if it's correct, then that's 
a win. I'd bet money you could do just that in Julia. I'm not 
suggesting we go that far, but they would definitely consider 
that a feature.

> Throw in C++-style SFINAE and Koenig lookup, and what ought to 
> be a 10-second source tree search for an identifier easily 
> turns into a 6-hour hair-pulling session of trying to 
> understand exactly which obscure rules the C++ compiler applied 
> to resolve those operators to which symbol(s) defined in which 
> obscure files buried deep in the source tree.

It's a good thing we don't have SFINAE and Koenig lookup.