buffered input

Fri Feb 4 23:22:47 PST 2011

On 02/04/2011 11:46 PM, Andrei Alexandrescu wrote:
> I've had the opportunity today to put some solid hours of thinking into
> the relationship (better said the relatedness) of what would be called
> "buffered streams" and ranges. They have some commonalities and some
> differences, but it's been difficult to capture them. I think now I have
> a clear view, caused by a few recent discussions. One was the CSV reader
> discussed on the Phobos list; another was the discussion on defining the
> "right" std.xml.
>
> First, let's start with the humblest abstraction of all - an input
> range, which only defines the troika empty/front/popFront with the known
> semantics.
>
> An input range consumes input destructively and has a one-element
> horizon. It may as well considered a buffered stream with the buffer
> length exactly one.
>
> Going from there, we may say that certain streaming can be done by using
> an input range of ubyte (or dchar for text). That would be the
> UTFpowered equivalent of getchar(). The readf function operates that way
> - it only needs to look one character ahead. Incidentally, the CSV
> format also requires lookahead of 1, so it also can operate on a range
> of dchar.
>
> At this point we need to ask ourselves an essential question. Since we
> have this "input range" abstraction for a 1-element buffer, what would
> its n-elements buffer representation look like? How do we go from "input
> range of T" (which really is "unbuffered input range of T" to "buffered
> input range of T"?
>
> Honestly, the answer was extremely unclear to me for the longest time. I
> thought that such a range would be an extension of the unbuffered one,
> e.g. a range that still offers T from front() but also offers some
> additional functions - e.g. a lookahead in the form of a random-access
> operator. I still think something can be defined along those lines, but
> today I came together with a design that is considerably simpler both
> for the client and the designer of the range.
>
> I hereby suggest we define "buffered input range of T" any range R that
> satisfies the following conditions:
>
> 1. R is an input range of T[]
>
> 2. R defines a primitive shiftFront(size_t n). The semantics of the
> primitive is that, if r.front.length >= n, then shiftFront(n) discards
> the first n elements in r.front. Subsequently r.front will return a
> slice of the remaining elements.

Does shiftFront literally move element n to index 0 and so on? It seems 
to me that if you do, its going to have horrid performance, and if you 
don't, then you will eventually run into situations where appendToFront 
will require a wrap around, which loses you your contiguity, or a 
reallocation of the buffer.