deprecating std.stream, std.cstream, std.socketstream

Wed May 16 13:38:54 PDT 2012

On 05/16/12 21:38, H. S. Teoh wrote:
> On Wed, May 16, 2012 at 12:48:49PM -0500, Andrei Alexandrescu wrote:
>> On 5/16/12 12:34 PM, Steven Schveighoffer wrote:
>>> In other words, ranges aren't enough.
>>
>> This is copiously clear to me, but the way I like to think about it
>> is by extending the notion of range (with notions such as e.g.
>> BufferedRange, LookaheadRange, and such) instead of developing an
>> abstraction independent from ranges and then working on stitching
>> that with ranges.
> [...]
> 
> One direction that _could_ be helpful, perhaps, is to extend the concept
> of range to include, let's tentatively call it, a ChunkedRange.
> Basically a ChunkedRange implements the usual InputRange operations
> (empty, front, popFront) but adds the following new primitives:
> 
> - bool hasAtLeast(R)(R range, int n) - true if underlying range has at
>   least n elements left;
> 
> - E[] frontN(R)(R range, int n) - returns a slice containing the front n
>   elements from the range: this will buffer the next n elements from the
>   range if they aren't already; repeated calls will just return the
>   buffer;
> 
> - void popN(R)(R range, int n) - discards the first n elements from the
>   buffer, thus causing the next call to frontN() to fetch more data if
>   necessary.
> 
> These are all tentative names, of course. But the idea is that you can
> keep N elements of the range "in view" at a time, without having to
> individually read them out and save them in a second buffer, and you can
> advance this view once you're done with the current data and want to
> move on.
> 
> Existing range operations like popFrontN, take, takeExactly, drop, etc.,
> can be extended to take advantage of the extra functionality of
> ChunkedRanges. (Perhaps popFrontN can even be merged with popN, since
> they amount to the same thing.)
> 
> Using a ChunkedRange allows you to write functions that parse a
> particular range and return a range of chunks (say, a deserializer that
> returns a range of objects given a range of bytes).
> 
> Thinking on it a bit further, perhaps we can call this a WindowedRange,
> since it somewhat resembles the sliding window protocol where you keep a
> "window" of sequential packet ids in an active buffer, and remove them
> from the buffer as they get ack'ed (consumed by popN). The buffer thus
> acts like a "window" into the next n elements in the range, which can be
> "slid forward" as data is consumed.

Right now, everybody reinvents this, with a slightly different interface...
It's really obvious, needed and just has to be standardized.
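
For reference, the three quoted primitives could be sketched roughly like
this. It is only an illustration, not a proposed implementation: the wrapper
name 'Chunked' and its private 'fill' helper are made up, the primitives are
written as members rather than free functions, and n is a size_t instead of
an int.

```d
import std.algorithm : min;
import std.range : ElementType, iota, isInputRange;

// Hypothetical wrapper (name invented for this sketch): keeps a buffer of
// elements pulled from any input range, so n of them stay "in view".
struct Chunked(R) if (isInputRange!R)
{
    alias E = ElementType!R;
    private R src;     // underlying range
    private E[] buf;   // elements fetched but not yet consumed

    // Pull from the source until n elements are buffered or it runs dry.
    private void fill(size_t n)
    {
        while (buf.length < n && !src.empty)
        {
            buf ~= src.front;
            src.popFront();
        }
    }

    // True if at least n elements are left.
    bool hasAtLeast(size_t n) { fill(n); return buf.length >= n; }

    // Slice of the front n elements (fewer if the source ends early);
    // repeated calls return the same buffered data.
    E[] frontN(size_t n) { fill(n); return buf[0 .. min(n, buf.length)]; }

    // Discard the first n elements, sliding the window forward.
    void popN(size_t n) { fill(n); buf = buf[min(n, buf.length) .. $]; }
}

void main()
{
    auto r = Chunked!(typeof(iota(10)))(iota(10));
    assert(r.frontN(3) == [0, 1, 2]);
    r.popN(2);
    assert(r.frontN(3) == [2, 3, 4]);
    assert(r.hasAtLeast(8) && !r.hasAtLeast(9));
}
```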

A few notes:

hasAtLeast is redundant, as it is better expressed via .length; what would
be the point of wrapping 'r.length>=n'? An '.available' property would be
useful, e.g. to find out how much can be consumed w/o blocking, but that one
should return a size_t too.

'E[] frontN' should have a version that returns all available elements; I
called it '@property E[] fronts()' here. It's more efficient that way and
doesn't rely on the compiler to inline and optimize the limit checks away.

popN -- well, its signature here is 'void popFronts(size_t n)'; other than
that, there's no difference.
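
To make the input side concrete, here is a minimal sketch of a range that
offers 'fronts', 'popFronts' and 'available' next to the usual InputRange
primitives. The struct name 'BulkInput' is made up, and it only wraps an
in-memory array; a real one would refill its buffer from some source.

```d
import std.range : isInputRange;

// Hypothetical sketch (name invented): an in-memory range with the bulk
// primitives described above alongside empty/front/popFront.
struct BulkInput(E)
{
    private E[] buf;   // elements currently buffered and ready to consume

    // Standard InputRange primitives.
    @property bool empty() { return buf.length == 0; }
    @property ref E front() { return buf[0]; }
    void popFront() { buf = buf[1 .. $]; }

    // Number of elements that can be consumed without blocking.
    @property size_t available() { return buf.length; }

    // All currently available elements, as a slice into the buffer (no copy).
    @property E[] fronts() { return buf; }

    // Discard the first n elements in one step.
    void popFronts(size_t n)
    {
        assert(n <= buf.length);
        buf = buf[n .. $];
    }
}

void main()
{
    // Still a plain input range, so existing range code keeps working.
    static assert(isInputRange!(BulkInput!int));

    auto r = BulkInput!int([1, 2, 3, 4, 5]);
    assert(r.available == 5);
    assert(r.fronts == [1, 2, 3, 4, 5]);
    r.popFronts(3);
    assert(r.front == 4);
    r.popFront();
    assert(r.fronts == [5]);
    r.popFront();
    assert(r.empty);
}
```

The caller checks the length of 'fronts' once and then works on the whole
slice, instead of paying for an empty/front/popFront round trip per element.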

Similar things are necessary for output ranges. Here, what I needed was:

   void put(ref E el)
   void puts(E[] els)
   @property size_t free() // Not the most intuitive name w/o context;
                           // returns the number of E's that can be 'put()'
                           // w/o blocking.                            

Note that all of this doesn't address the consume-variable-sized-chunks issue.
But that can now be efficiently handled by another layer on top.
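
A minimal sketch of the output side, assuming a fixed-capacity sink. The
struct name 'BulkOutput', the constructor and the 'written' accessor are made
up for the example; only put/puts/free come from the list above.

```d
// Hypothetical sketch (name invented): a fixed-capacity sink with the three
// output primitives. A real one would hand the data on to a consumer.
struct BulkOutput(E)
{
    private E[] store;    // backing storage, fixed capacity
    private size_t used;  // number of slots already filled

    this(size_t capacity) { store = new E[capacity]; }

    // Number of E's that can be put without blocking.
    @property size_t free() const { return store.length - used; }

    // Append a single element.
    void put(ref E el)
    {
        assert(free >= 1);
        store[used++] = el;
    }

    // Append a whole slice: one limit check and one block copy.
    void puts(E[] els)
    {
        assert(els.length <= free);
        store[used .. used + els.length] = els[];
        used += els.length;
    }

    // Elements written so far; exists only so the sketch can be inspected.
    @property const(E)[] written() const { return store[0 .. used]; }
}

void main()
{
    auto o = BulkOutput!int(4);
    int x = 7;
    o.put(x);          // put takes its argument by ref, so it needs an lvalue
    o.puts([8, 9]);
    assert(o.free == 1);
    assert(o.written == [7, 8, 9]);
}
```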

On 05/16/12 22:15, Steven Schveighoffer wrote:
> I still don't get the need to "add" this to ranges.  The streaming API works fine on its own.

This is not an argument against a streaming API (at least not for me), but
for efficient ranges. With the API above I can shift tens of gigabytes of
data per second between threads. And still use the 'std' range API and
everything that works with it...

> But there is an omission with your proposed API regardless -- reading data is a mutating event.  It destructively mutates the underlying data stream so that you cannot get the data again.  This means you must double-buffer data in order to support frontN and popN that are not necessary with a simple read API.
> 
> For example:
> 
> auto buf = new ubyte[1000000];
> stream.read(buf);
> 
> does not need to first buffer the data inside the stream and then copy it to buf, it can read it from the OS *directly* into buf.

Sometimes having the buffer managed by 'stream' and 'read()' returning a slice
into it works (this is what 'fronts' above does). Reusing a caller-managed
buffer can be useful in other cases, yes.
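
The two styles side by side, as a rough sketch. 'MemStream' is a made-up
in-memory stand-in for a real stream; in a real implementation of the second
style the OS would write straight into the caller's buffer, which the copy
below only imitates.

```d
import std.algorithm : min;

// Hypothetical in-memory "stream", used only to contrast the two styles.
struct MemStream
{
    private const(ubyte)[] data;  // stands in for data arriving from the OS

    // Style 1: the stream owns the buffer. The caller gets a slice into it
    // and afterwards says how much of it was consumed.
    @property const(ubyte)[] fronts() const { return data; }
    void popFronts(size_t n) { data = data[n .. $]; }

    // Style 2: the caller owns the buffer. The stream fills it and returns
    // the number of bytes actually stored.
    size_t read(ubyte[] buf)
    {
        immutable n = min(buf.length, data.length);
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

void main()
{
    ubyte[] src = [1, 2, 3, 4, 5];
    auto s = MemStream(src);

    // Stream-managed buffer: look at the data in place, then advance.
    auto view = s.fronts;
    assert(view.length == 5 && view[0] == 1);
    s.popFronts(2);

    // Caller-managed buffer: the same storage can be reused for every read.
    auto buf = new ubyte[2];
    assert(s.read(buf) == 2);
    assert(buf[0] == 3 && buf[1] == 4);
    assert(s.fronts.length == 1 && s.fronts[0] == 5);
}
```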

artur