stream interfaces - with ranges

Thu May 17 08:46:18 PDT 2012

On 5/17/12 9:02 AM, Steven Schveighoffer wrote:
> 1. We need a buffering input stream type. This must have additional
> methods besides the range primitives, because doing one-at-a-time byte
> reads is not going to cut it.

I was thinking a range of T[] could be enough for a buffered input range.

> 2. I realized that a buffering input stream of type T is actually an input
> range of type T[]. Observe:

Ah, there we go :o).

> struct /*or class*/ buffer(T)
> {
> T[] buf;
> InputStream input;
> ...
> @property T[] front() { return buf; }
> void popFront() {input.read(buf);} // flush existing buffer, read next.
> @property bool empty() { return buf.length == 0;}
> }
>
> Roughly speaking, not all the details are handled, but this makes a
> feasible input range that will perform quite nicely for things like
> std.algorithm.copy. I haven't checked, but copy should be able to handle
> transferring a range of type T[] to an output range with element type T;
> if it's not able to, it should be made to work.

We can do this for copy, but if we need to specialize a lot of other 
algorithms, maybe we didn't strike the best design.
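
To make the copy case concrete, here is a minimal sketch. It assumes
std.range.chunks as a stand-in for the buffered stream (an input range of
ubyte[]) and an Appender as the output range of ubyte. The helper name
collect is hypothetical.

```d
import std.algorithm.mutation : copy;
import std.array : appender;
import std.range : chunks;

// Copies a range whose elements are ubyte[] chunks into an output range of
// ubyte and returns the collected bytes. chunks() only stands in for a real
// buffered stream; that substitution is an assumption of this sketch.
ubyte[] collect(ubyte[] source, size_t chunkSize)
{
    auto byChunk = source.chunks(chunkSize); // input range of ubyte[]

    // copy() passes each chunk to put(), and the appender accepts a whole
    // slice at a time, so no per-byte loop is written here.
    auto sink = byChunk.copy(appender!(ubyte[])());
    return sink.data;
}
```

The point of the sketch is that the chunk-to-element step lives in the output
range's put, not in copy itself, which is why copy needs no special case here.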

> I know at least, an
> output stream with element type T supports putting T or T[].

Right.

> What I think
> really makes sense is to support:
>
> buffer!ubyte b;
> outputStream o;
>
> o.put(b); // uses range primitives to put all the data to o, one element
> (i.e. ubyte[]) of b at a time

I think that makes sense.
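
A rough sketch of what o.put(b) amounts to, written out as an explicit loop.
Sink and putAll are hypothetical names, and std.range.chunks again stands in
for buffer!ubyte.

```d
import std.range : chunks;

// Hypothetical output stream of T: accepts one element or a whole slice.
struct Sink(T)
{
    T[] written;

    void put(T value) { written ~= value; }
    void put(const(T)[] values) { written ~= values; }
}

// Walks the chunked range with the range primitives (via foreach) and hands
// each T[] element to the slice overload of put.
void putAll(O, R)(ref O o, R b)
{
    foreach (chunk; b)
        o.put(chunk);
}

ubyte[] drain(ubyte[] source, size_t chunkSize)
{
    Sink!ubyte o;
    putAll(o, source.chunks(chunkSize)); // one put call per chunk
    return o.written;
}
```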

> 3. An ultimate goal of the i/o streaming package should be to be able to
> do this:
>
> auto x = new XmlParser("<rootElement></rootElement>");
>
> or at least
>
> auto x = new XmlParser(buffered("<rootElement></rootElement>"));
>
> So I think arrays need to be able to be treated as buffering streams. I
> tried really hard to think of some way to make this work with my existing
> system, but I don't think it will without unnecessary baggage, and losing
> interoperability with existing range functions.

I think we can create a generic abstraction buffered() that layers 
buffering on top of an input range. If the input range has an unbuffered 
bulk read capability, buffered() would use it. Otherwise, it would fall 
back to filling the buffer with loops over empty, front, and popFront.
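
Roughly, and only as a sketch: the names hasBulkRead, ArraySource, and
Buffered are hypothetical, and the read(T[]) signature returning a count is
an assumption, since the actual unbuffered primitives are not settled in this
thread.

```d
import std.range.primitives : empty, front, popFront;

// Hypothetical capability test: R offers read(T[]) returning the count read.
enum hasBulkRead(R, T) = is(typeof(R.init.read((T[]).init)) : size_t);

// Hypothetical unbuffered source with a bulk read, backed by an array.
struct ArraySource(T)
{
    T[] data;

    size_t read(T[] dest)
    {
        immutable n = dest.length < data.length ? dest.length : data.length;
        dest[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// Input range of T[] layered over either kind of source.
struct Buffered(R, T)
{
    private R source;
    private T[] storage; // fixed-capacity backing memory
    private T[] buf;     // valid part of storage; overwritten by popFront

    this(R source, size_t capacity)
    {
        this.source = source;
        storage = new T[capacity];
        popFront(); // load the first chunk
    }

    @property T[] front() { return buf; }
    @property bool empty() const { return buf.length == 0; }

    void popFront()
    {
        size_t n = 0;
        static if (hasBulkRead!(R, T))
        {
            n = source.read(storage); // one bulk call per chunk
        }
        else
        {
            // Fallback: pull elements one at a time via range primitives.
            for (; n < storage.length && !source.empty; source.popFront())
                storage[n++] = source.front;
        }
        buf = storage[0 .. n];
    }
}

// T is given explicitly because a read-only source has no ElementType.
auto buffered(T, R)(R source, size_t capacity = 4096)
{
    return Buffered!(R, T)(source, capacity);
}
```

For example, buffered!int([1, 2, 3, 4, 5], 2) takes the fallback path and
buffered!int(ArraySource!int([1, 2, 3, 4, 5]), 2) takes the bulk path. Each
chunk is a slice of the reused storage, so a caller that wants to keep it
past the next popFront has to copy it.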

> Where does this leave us?
>
> 1. I think we need, as Andrei says, an unbuffered streaming abstraction.
> I think I have this down pretty solidly in my current std.io.

Great. What are the primitives?

> 2. A definition of a buffering range, in terms of what additional
> primitives the range should have. The primitives should support buffered
> input and buffered output (these are two separate algorithms), but
> independently (possibly allowing switching for rw files).

Sounds good.

> 3. An implementation of the above definition hooked to the unbuffered
> stream abstraction, to be utilized in more specific ranges. But by
> itself, can be used as an input range or directly by code.

Hah, I can't believe I wrote about the same thing above (and I swear I 
didn't read yours).

> 4. Specialization ranges for each type of input you want (i.e. byLine,
> byChunk, textStream).

What is the purpose? To avoid unnecessary double buffering?

> 5. Full replacement option of File backend. File will start out with
> C-supported calls, but any "promotion" to using a more D-like range type
> will result in switching to a D-based stream using the above mechanisms.
> Of course, all existing code that does not assume the File always has a
> valid FILE * should still compile.

This will be tricky but probably doable.

Andrei