Do sorted ranges have any special properties?
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Tue Jul 27 07:05:11 PDT 2010
Peter Alexander wrote:
> Personally, I would just expose all accelerated algorithms for sorted
> ranges, and make the user call them explicitly e.g. if they know their
> range is sorted, make them call binarySearch (or whatever) instead of find.
>
> I have two primary reasons for this:
>
> 1. Having the function choose the right algorithm puts an undue burden
> on the function writer, which translates into increased uncertainty for
> the user of the function, "If I call find(assumeSorted(r)), is it going
> to use binary search? What if I call SomeThirdParty.find? Will it know
> about assumeSorted?"
>
> 2. While we're only discussing assumeSorted now, there are a myriad of
> other assumeX wrappers that would be useful:
>
> assumeUnimodal(range) - Allows ternary search for maxElement.
> assumeDistinct(range) - uniq doesn't need to do anything.
> assumePrime(int) - e.g. getFactors(assumePrime(x)) returns [1, x]
> assumeAssociative(binOp) - power can work in O(lg(n)).
>
> * power!(binOp)(a, b) := reduce!(binOp)(replicate(a, b))
>
> Do we provide wrappers for these as well, and make the algorithms check
> for all relevant ones?
>
> ---
>
> I say no. Provide ternarySearch, powerAssociative etc., and let the user
> choose them when he/she knows they are the right algorithm. That way,
> function writers can focus on writing specialized functions, and
> function users can rest assured that the function they are calling is
> performing the algorithm that they expect (with the complexity bounds
> that they expect).
I am quoting this in full because I agree with it in full! I think this
and opIn_r settle the matter in favor of making the additional goodies
members of Sorted!Range.
A few additional thoughts:
- Experience with the STL has shown that "specialized functions for
improved complexity with specialized syntax" (a la find as a member vs
nonmember) fare better than "best-effort functions" (a la STL's distance).
- I'm considering allowing assignment to .front, .back, [k] etc. with
runtime enforcement that the assignment doesn't break the ordering.
There is a lure to have them actually shuffle the assigned element in
the right place, but that would break many assumptions. Many algorithms
assume that after r.front = x they can assert(r.front == x).
- All binary search operators (lowerBound, upperBound, equalRange)
should be made members of Sorted!Range. So instead of writing:
// We know r is sorted, baby
auto r1 = upperBound(r, x);
you'd write:
auto r1 = assumeSorted(r).upperBound(x);
thus essentially moving the comment into the code. (I also have a cool
idea for statistically checking sortedness without deteriorating
complexity. Essentially assumeSorted(r) could check that r is sorted by
tossing a rigged coin with P(head)=1.0/r.length and deciding to check if
head comes up. That way, the amortized complexity of the check is O(1)).
This all raises the question: where should this Sorted(R) in the making
go? It's a range so it should go in std.range, but it's mostly a
motivator for algorithms, so it should go in std.algorithm. Tiebreakers?
I'm very excited about this development. Maybe I should include some of
this stuff in my Google talk on Friday.
Andrei
More information about the Digitalmars-d
mailing list