Do sorted ranges have any special properties?

Tue Jul 27 07:05:11 PDT 2010

Peter Alexander wrote:
> Personally, I would just expose all accelerated algorithms for sorted 
> ranges, and make the user call them explicitly e.g. if they know their 
> range is sorted, make them call binarySearch (or whatever) instead of find.
> 
> I have two primary reasons for this:
> 
> 1. Having the function choose the right algorithm puts an undue burden 
> on the function writer, which translates into increased uncertainty for 
> the user of the function, "If I call find(assumeSorted(r)), is it going 
> to use binary search? What if I call SomeThirdParty.find? Will it know 
> about assumeSorted?"
> 
> 2. While we're only discussing assumeSorted now, there are a myriad of 
> other assumeX wrappers that would be useful:
> 
> assumeUnimodal(range) - Allows ternary search for maxElement.
> assumeDistinct(range) - uniq doesn't need to do anything.
> assumePrime(int) - e.g. getFactors(assumePrime(x)) returns [1, x]
> assumeAssociative(binOp) - power can work in O(lg(n)).
> 
> * power!(binOp)(a, b) := reduce!(binOp)(replicate(a, b))
> 
> Do we provide wrappers for these as well, and make the algorithms check 
> for all relevant ones?
> 
> ---
> 
> I say no. Provide ternarySearch, powerAssociative etc., and let the user 
> choose them when he/she knows they are the right algorithm. That way, 
> function writers can focus on writing specialized functions, and 
> function users can rest assured that the function they are calling is 
> performing the algorithm that they expect (with the complexity bounds 
> that they expect).

I am quoting this in full because I agree with it in full! I think this 
and opIn_r settle the matter in favor of making the additional goodies 
members of Sorted!Range.

A few additional thoughts:

- Experience with the STL has shown that "specialized functions for 
improved complexity with specialized syntax" (a la find as a member vs 
nonmember) fare better than "best-effort functions" (a la STL's distance).

- I'm considering allowing assignment to .front, .back, [k] etc. with 
runtime enforcement that the assignment doesn't break the ordering. 
There is a lure to have them actually shuffle the assigned element in 
the right place, but that would break many assumptions. Many algorithms 
assume that after r.front = x they can assert(r.front == x).

- All binary search operators (lowerBound, upperBound, equalRange) 
should be made members of Sorted!Range. So instead of writing:

// We know r is sorted, baby
auto r1 = upperBound(r, x);

you'd write:

auto r1 = assumeSorted(r).upperBound(x);

thus essentially moving the comment into the code. (I also have a cool 
idea for statistically checking sortedness without deteriorating 
complexity. Essentially assumeSorted(r) could check that r is sorted by 
tossing a rigged coin with P(head)=1.0/r.length and deciding to check if 
head comes up. That way, the amortized complexity of the check is O(1)).

This all raises the question: where should this Sorted(R) in the making 
go? It's a range so it should go in std.range, but it's mostly a 
motivator for algorithms, so it should go in std.algorithm. Tiebreakers?

I'm very excited about this development. Maybe I should include some of 
this stuff in my Google talk on Friday.

Andrei