Safe Cursors and Ranges

Thu Aug 26 09:18:06 PDT 2010

== Quote from Pillsy (pillsbury at gmail.com)'s article
> These thoughts were inspired by the recent thread, "Retrieving the
> traversed range". I am generally pretty sold on the range concept,
> but I think for some operations some sort of cursor is required.
> However, there are some unsafe cursor-related operations that are
> best avoided.
> However, as far as I can see, the unsafe operations revolve around
> synthesizing a new range out of two cursors. However, you can get
> a lot of the desired functionality associated with finding elements and
> splitting ranges *without* allowing the synthesis of of ranges from
> unrelated cursors. The key to doing this is to have each cursor be permanently
bound to the underlying range which it was derived from.
> If you have this kind of cursor, you only really need it to support two
operations to get a huge amount of benefit:
>      Range before ();
>      Range after ();
> The before () operation just returns a range of the appropriate type (Forward,
Bidirectional, RandomAccess) of all the elements that come before the cursor is
pointing to, while the after () operation returns a range of the appropriate type
with all the elements after the cursor. Given this concept, a cursor really points
at the boundary between elements of a range, if you will.
> If find and related operations return this kind of cursor, all of the operations
involving splitting ranges become trivial. It's completely consistent for either
before () or after () to be empty, and since you can't glue two of these cursors
together, you are always guaranteed to get well-defined ranges out.
> While I think the before () and after () operations are the only ones
> which are strictly necessary for the desiderata I've seen so far, people
> could write their own generic algorithms if cursors supported more of
> the iterator operations, for moving forward (and backward, where appropriate)
and inspecting the element immediately after the cursor (or before, where
appropriate), and if ranges supported operations to return cursors from reasonable
places (like from the front or back, or from a specific indexed elemnt for
RandomAccess ranges).
> The key idea is that these cursors aren't a primitive part of a range; instead,
they take a range and add a position *inside* the range. They're a perfect fit for
all those "three-legged" operations out there because they actually have three legs.
> Cheers,
> Pillsy

I'm starting to think that the whole concept of ranges is starting to get too
complicated.  We're already seeing it with this moveFront() stuff that's a special
case to deal with structs that have expensive postblits.  We need a healthy dose
of "worse is better" to throw at ranges.  IMHO generic code should solve 90% of
the problem in a way that's robust, not terribly hard to implement and easy to use
and understand.  Trying to solve 100% of the problem with generic code means that
your solution will be absurdly complicated in both interface and implementation,
and therefore will be a buggy POS that nobody can figure out how to use.

Adding yet more complexity to ranges has the following negative consequences, such
that I think it's justified to sacrifice completeness in favor of simplicity:

1.  Having yet more things to wrap makes writing correct higher order ranges that
much more difficult and bug-prone.  The composability of ranges is what makes them
such an elegant and useful abstraction in the first place.

2.  The combinatorial explosion of range types (isBidirectional, hasBefore,
hasLength, ...) makes testing a higher order range a nightmare.

3.  The more stuff we require someone to define to create a range (for example, if
before() were simply required for bidirectional ranges), the higher the odds that
we'll run into cases where one of these primitives can't be implemented efficiently.

4.  I remember when ranges first came out, some people were pretty confused by
them.  Adding more stuff to solve the last 10% of cases will only add to this.
It's better to have ranges that can solve 90% of the problem for the most
competent 80% of programmers than ranges that can solve 100% of the problem, but
only for the 1% who are hardcore gurus.