opCmp / opEquals do not actually support partial orders

Tue Jul 17 18:21:26 UTC 2018

As we know, when opCmp is defined for a user type, then opEquals must
also be defined in order for == to work, even though in theory the
compiler could translate x==y into x.opCmp(y)==0.

In the past, it was argued that this was so that partial orders could be
made to work, i.e., it could be the case that x and y are incomparable,
then x.opCmp(y) would return 0, and x.opEquals(y) would be false.
Supposedly, this would allow us to, say, model the subset relation
(which is a partial order) using the comparison operators.

However, this supposition is in fact false.  The reason is that `<=` and
`>=` *only* check the return value of opCmp(); they do not check
opEquals at all.  So suppose we define a Set type, and we'd like to use
the comparison operators to model the subset relation.  How would we
define opCmp and opEquals?

opEquals is obvious, of course.  It's true if x and y have the same
elements, false otherwise.

But opCmp turns out to be a tarpit.  Here's why:

- If x is a (strict) subset of y, then obviously we should return -1.

- If x is a (strict) superset of y, then obviously we return 1.

- If x and y have the same elements, then obviously we should return 0,
  since we'd like x <= y and x >= y to be true when x == y is true.

- If x and y are not subsets of each other, then what should we return?
  According to the original claim, it should also return 0, for
  "incomparable".  However, this leads to problems:

  - Since `x <= y` in D means `x.opCmp(y) <= 0`, this will return TRUE
    when x and y are incomparable.

  - Similarly, `x >= y` in D means `x.opCmp(y) >= 0`, so it will also
    return TRUE when x and y are incomparable.

  - Therefore, `x <= y` cannot distinguish between x being a non-strict
    subset of y vs. x and y being incomparable.  Similarly for `x >= y`.

So we have the counterintuitive semantics that <= and >= will return
true for non-comparable sets.

There is no return value of opCmp for which both <= and >= will be
false, as we need it to be if we are to map <= to ⊆ and >= to ⊇.

Furthermore, this situation cannot be remedied by redefining < and > to
be ⊆ and ⊇ (as we might try to do in order to work around <= and >= not
checking the return value of opEquals), because when x == y, then there
is no return value that opCmp could return that would make `x < y` and
`x > y` both true simultaneously.

IOW, with the current semantics of opEquals and opCmp, it is impossible
to map the semantics of ⊆ and ⊇ to the comparison operators, in spite of
the fact that we allow opCmp() to return 0 when opEquals returns false.

Conclusion: the claim that opCmp/opEquals could be made to support
partial orders is patently false.

In fact, it cannot even be made to model NaN's in floating-point
arithmetic, which is a mostly-linear ordering, because there is no way
to make <= and >= both false using opCmp() when NaN's are involved
(e.g., in a user-defined floating-point wrapper type).  The best you can
get is that `x <= NAN` and `x >= NAN` is always true.

Corollary: allowing opCmp()==0 to be inconsistent with opEquals()==true
is useless, since we cannot actually use it to support any meaningful
ordering that isn't a linear ordering.  Thus, it only serves as a source
of confusion, and should be made illegal in the language.

T

-- 
EMACS = Extremely Massive And Cumbersome System