challenge: implement the max function

Wed Jan 24 22:32:08 PST 2007

Andrei Alexandrescu (See Website For Email) wrote:
> Frits van Bommel wrote:
>> Andrei Alexandrescu (See Website For Email) wrote:
>>> Here's a simple challenge: implement the max function. Requirements:
>>>
>>> a) generic
>>> b) efficient
>>> c) preserve lvalueness when possible such that one can write e.g.
>>>
>>> max(arr[0], arr[1]) *= 0.9;
>>>
>>> d) should accept two or more arguments
>>> e) should return the "smartest" type, e.g. max of an unsigned int and 
>>> unsigned long should return unsigned long
>>> f) short and easy to understand
>>>
>>> I don't think it's possible to implement the function to the spec in 
>>> current D, so designs are allowed to invent new features, as long as 
>>> they define them thoroughly.
...
> Heck, I did it in C++. It takes some hundred plus lines of code though 
> :o). The obvious purpose of my challenge is to find what it takes to 
> make it a trivial few-liner in D.
> 
> 
> Andrei

Here's my late entry:

alias max_type(T, U) {
   static if (T.max > U.max) {
     alias T max_type;
   } else {
     alias U max_type;
   }
}

max_type(T, U) ALIAS max(T, U)(inout T x, inout U y)
{
     alias max_type(T,U) MT;
     MT x1 = x, y1 = y;

     if (x1 >= y1) {
         static if (is (MT == T)) {
             return alias x;
         } else {
             return x1;
         }
     } else {
         static if (is (MT == U)) {
             return alias y;
         } else {
             return y1;
         }
     }
}

The new feature is "return alias x", which is just syntax for saying 
that you are returning a reference to the input parameter.

Note that "ALIAS" should be replaced by some keyword -- or maybe remvoed 
entirely, since it can be deduced easily as described below.  It should 
be safe to remove it entirely since we can deduce it from the I was 
going to use "alias" but it looks too much like If D ever gets a keyword 
to indicate the opposite of C++'s "strict" keyword, I would think it 
could be used here.

A. Generic

As long as "U + T" is legal, I think this could work.  A better choice 
of the return value could be made if you expect "char[]" to be used, but 
otherwise, I think it's a pretty generic solution.  If constants are 
passed into the function, the "static if" needs to fail somehow.

B. Efficient

Consider this simple saltshaker...

One way this could work from a stack packing point of view, is to 
imagine that the code actually returns this object:

struct real_return_t {
     max_type(T,U) * _proxy;
     max_type(T,U)   _data;
}

Then, plain old "return XYZ" does this:

  real_return_t rv;
  rv._data  = XYZ;
  rv._proxy = & rv._data;

But the "return alias XYZ" does this:

  real_return_t rv;
  rv._proxy = & x; // input arg

In the calling code, the return value of max() is the dereference of 
rv._proxy, so that  max(a, b)  becomes  *(max(a, b).proxy), thus 
allowing the assignment.

Optimization:

1. If T == U, then both returns end up being "return alias", which means 
that the caller never uses the 'rv._data' member; it can be eliminated 
and the function becomes T * max(...), with automagic dereferencing of 
the return value.

2. If typeof(T + U) is neither T nor U (think byte + uint -> int), then 
the rv.data field is always used, which means the return value is always 
equivalent to "*& rv._data".  This means that the member rv._proxy can 
be optimized away.  Thus, it could becomes normal int max(...).

3. If T == MT but T != U, or vice versa, then the real_return_t object 
cannot be removed, but we essentially have a variant return type, and a 
hand-written implementation would normally use the same technique to 
return either the input primitive or another one.

So, in cases 1-3, I think it is as efficient as we could expect from a 
hand-written job.

C. Lvalueness

I think so.  This preserves it when possible, gets rid of it when not, 
and does so in a way that is highly readable.  NOTE that this solution 
preserves lvalue-ness above-and-beyond because for (int x, long y), it 
preserves it for y even though it can't preserve it for x.

D. Two or More arguments

I didn't code up this part but my gut reaction solution is to use a 
static foreach (do those exist yet?) in the max() function and some kind 
of recursive template dance in the max_type template.

E. I think using "+" is enough for this if classes are not involved.  If 
they are then I don't think deducing the max of two classes is possible 
without cooperation from them.  However, one could try to assign in both 
directions and see which assignment "hurts less" with something like SFINAE.

F. As long as you don't count this post... but the code looks tidy.

------

Solution Two

NOTE: This is not a C++ "&", see below.

template max(T, U) {
   static if (T == U) {
     T & max(T & x, U & y)
     {
       if (*x >= *y) {
         return x;
       } else {
         return y;
       }
     }
   } else {
     max_type(T, U) max(T, U)(T x, U y)
     {
       alias max_type(T,U) MT;

       if (x >= y) {
         return x;
       } else {
         return y;
       }
     }
   }
}

The "&" here is kind of like C++'s "&" type, but you can't use it except 
in function interfaces.  In the context of a function argument, it means 
that the caller of the function is actually passing "& x" when they say 
"x", and in the case of a return value, the caller thinks they get 
something like "T x", but it they actually get a "T * x", and when they 
use it, the "x" turns into "*x".

Inside the function body, no magic translations are done, the writer of 
the function use actual pointers to do the work to make it harder to 
accidentally return addresses of local variables.

I think this satisfies A B C E F, though not as heroicly with C as the 
Solution One (which may be a good thing.)

Kevin