streaming redux

Tue Dec 28 03:09:09 PST 2010

On Tue, 28 Dec 2010 09:02:29 +0200, Andrei Alexandrescu  
<SeeWebsiteForEmail at erdani.org> wrote:

> I've put together over the past days an embryonic streaming interface.  
> It separates transport from formatting, input from output, and buffered  
> from unbuffered operation.
>
> http://erdani.com/d/phobos/std_stream2.html
>
> There are a number of questions interspersed. It would be great to start  
> a discussion using that design as a baseline. Please voice any related  
> thoughts - thanks!

Here are my humble observations:

First of all: was range-like duck typing considered for streams? The  
language allows on-demand runtime polymorphism, and static typing allows  
compile-time detection of stream features for abstraction. Not sure how  
useful this is in practice, but it allows some optimizations (e.g. the  
code can be really fast when working with memory streams, due to inlining  
and lack of vcalls).
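
A rough sketch of what range-like duck typing could look like for streams.  
All names here (isInputStream, hasSeek, MemoryStream, InputStreamObject,  
countBytes) are hypothetical and not part of the proposed std.stream2:

```d
import std.algorithm.comparison : min;

// Hypothetical capability checks, in the spirit of isInputRange.
enum isInputStream(S) = is(typeof(S.init.read((ubyte[]).init)) : size_t);
enum hasSeek(S) = is(typeof(S.init.seek(ulong.init)));

// A memory stream as a plain struct: no vtable, so read() can be inlined.
struct MemoryStream
{
    const(ubyte)[] data;
    size_t pos;

    size_t read(ubyte[] buf)
    {
        immutable n = min(buf.length, data.length - pos);
        buf[0 .. n] = data[pos .. pos + n];
        pos += n;
        return n;
    }

    void seek(ulong offset) { pos = cast(size_t) min(offset, data.length); }
}

// Generic code is resolved statically for whatever stream type it is given.
size_t countBytes(S)(ref S stream) if (isInputStream!S)
{
    ubyte[64] buf;
    size_t total;
    for (size_t n; (n = stream.read(buf[])) != 0; )
        total += n;
    return total;
}

// Runtime polymorphism only on demand: wrap any stream in a class.
interface InputStream { size_t read(ubyte[] buf); }

final class InputStreamObject(S) if (isInputStream!S) : InputStream
{
    private S impl;
    this(S impl) { this.impl = impl; }
    size_t read(ubyte[] buf) { return impl.read(buf); }
}
```

The same countBytes works on a MemoryStream directly (static dispatch) and  
on an InputStream reference (virtual dispatch), so the cost of vcalls is  
paid only by code that asks for it.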

Also, why should there be support for unopened streams? While a stream  
should be flush-able and close-able, opening and reopening streams should  
be done at a higher level IMO.

> Question: Should we offer an open primitive at this level? If so, what  
> parameter(s) should it take?

I don't see how this would be implemented at the lowest level, taking into  
consideration all the possible stream types (network connections, pipes,  
etc.)

> Question: Should we offer a primitive rewind that takes the stream back  
> to the beginning? That might be supported even by some streams that  
> don't support general seek calls. Alternatively, some streams might  
> support seek(0, SeekAnchor.start) but not other calls to seek.

If seek support is determined at runtime by whether the call throws an  
exception or not, then I see no difference between having a rewind method  
and having non-zero seek throw.

> Question: May we eliminate seekFromCurrent and seekFromEnd and just have  
> seek with absolute positioning? I don't know of streams that allow seek  
> without allowing tell. Even if some stream doesn't, it's easy to add  
> support for tell in a wrapper. The marginal cost of calling tell is  
> small enough compared to the cost of seek.

Does anyone ever use seekFromEnd in practice (except the rare case of  
supporting certain file formats)? seekFromCurrent is a nice convenience,  
but every abstract method increases the burden for implementers.
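
One way to keep the convenience without the burden, sketched under the  
assumption that a stream exposes only tell(), seek() with an absolute  
offset, and size() (FakeStream and these exact signatures are made up for  
illustration):

```d
// Hypothetical stand-in for a seekable stream; only tracks a position.
struct FakeStream
{
    ulong pos;
    ulong length;

    ulong tell() const { return pos; }
    ulong size() const { return length; }
    void seek(ulong offset) { pos = offset; }
}

// Relative seeks as free functions: implementers never have to write them.
void seekFromCurrent(S)(ref S s, long delta)
{
    s.seek(s.tell() + delta);
}

void seekFromEnd(S)(ref S s, long delta)
{
    s.seek(s.size() + delta);
}
```

The relative variants are written once, outside the interface, and work  
for any type that provides the three primitives.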

> Buffered*Transport

I always thought that a perfect stream library would have buffering as an  
additional layer. For example: auto f = new Buffered!FileStream(...);
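
A minimal sketch of such a layer, assuming the wrapped stream only needs a  
read(ubyte[]) method that returns the number of bytes read (0 at end of  
stream). Buffered is written as a struct here, and ArraySource is a  
made-up source used only to count how often the underlying stream is hit:

```d
import std.algorithm.comparison : min;

// Buffering as a wrapper around any type with read(ubyte[]).
struct Buffered(S)
{
    S source;
    ubyte[] buffer;
    size_t start, end;   // unread bytes are buffer[start .. end]

    this(S source, size_t capacity = 4096)
    {
        this.source = source;
        buffer = new ubyte[capacity];
    }

    size_t read(ubyte[] dst)
    {
        size_t copied;
        while (copied < dst.length)
        {
            if (start == end)
            {
                // Buffer exhausted: refill it with one call to the source.
                end = source.read(buffer);
                start = 0;
                if (end == 0) break;   // end of stream
            }
            immutable n = min(dst.length - copied, end - start);
            dst[copied .. copied + n] = buffer[start .. start + n];
            start += n;
            copied += n;
        }
        return copied;
    }
}

// Hypothetical unbuffered source that counts calls to read().
final class ArraySource
{
    const(ubyte)[] data;
    int calls;

    this(const(ubyte)[] data) { this.data = data; }

    size_t read(ubyte[] buf)
    {
        ++calls;
        immutable n = min(buf.length, data.length);
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}
```

With this shape, `auto f = Buffered!ArraySource(src);` is all it takes to  
add buffering, and many small reads turn into a few large ones on the  
underlying stream.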

> abstract interface Formatter;

I'm really not sure about this interface. I can see at most three  
implementations of it (native, big-endian and little-endian variants),  
everything else being too obscure to count. I think it should be  
implemented as static structs instead. Also, having an abstract method for  
each native type is quite ugly by D standards; I'm sure there's a better  
solution.
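
Roughly what I have in mind, as a sketch only. BigEndianFormat,  
LittleEndianFormat and put are invented names; nativeToBigEndian and  
nativeToLittleEndian are the existing std.bitmanip functions. One templated  
method replaces the per-type abstract methods:

```d
import std.bitmanip : nativeToBigEndian, nativeToLittleEndian;
import std.traits : isIntegral;

// Formatters as static structs: no instances, no virtual calls.
struct BigEndianFormat
{
    static auto encode(T)(T value) if (isIntegral!T)
    {
        return nativeToBigEndian(value);   // ubyte[T.sizeof]
    }
}

struct LittleEndianFormat
{
    static auto encode(T)(T value) if (isIntegral!T)
    {
        return nativeToLittleEndian(value);
    }
}

// The format is a compile-time parameter; T is deduced from the argument.
void put(Format, T)(ref ubyte[] sink, T value)
{
    auto bytes = Format.encode(value);
    sink ~= bytes[];
}
```

A call then looks like `put!BigEndianFormat(sink, someUint);`, and a  
native variant would simply append the value's bytes unchanged.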

> Question: Should all formatters require buffered transport? Otherwise  
> they might need to keep their own buffering, which ends up being less  
> efficient with buffered transports.

Ideally buffering would be optional, and constructing a buffer-enabled  
stream should be so easy it'd be an easily adoptable habit (see above).  
Last time I tried to do I/O in Java (or was it C#?) I had to instantiate  
3-4 classes before I could read from a file. D can do better.

> Question: Should we also define putln that writes the string and then an  
> line terminator?

But then you're mixing together text and binary streams into the same  
interface. I don't think this is a good idea.

> Question: Should we define a more involved protocol?

"A more involved protocol" would really be proper serialization. Calling  
toString can work as a convenience, similar to writefln's behavior.

> This final function writes a customizable "header" and a customizable  
> "footer".

What is the purpose of this? TypeInfo doesn't contain the field names, so  
it can't be used for protobuf-like serialization. Compile-time reflection  
would be much more useful.
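
For illustration, here is how field names can be recovered at compile time  
with .tupleof and __traits(identifier, ...). The function toKeyValue and  
the struct Point are made up for this example; a real serializer would  
emit whatever wire format it needs instead of a string:

```d
import std.conv : to;

// Walks the fields of any struct at compile time, emitting name=value pairs.
string toKeyValue(T)(in T value) if (is(T == struct))
{
    string result;
    foreach (i, field; value.tupleof)
    {
        if (result.length) result ~= ", ";
        result ~= __traits(identifier, T.tupleof[i]) ~ "=" ~ to!string(field);
    }
    return result;
}

// Hypothetical user type; no registration or TypeInfo lookup is needed.
struct Point
{
    int x;
    int y;
    string label;
}
```

The loop is unrolled per field by the compiler, so the names are available  
without any runtime type information.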

> Question: Should we pass the size in advance, or make the stream  
> responsible for inferring it?

Code that needs to handle allocation itself can make the small effort of  
writing the lengths as well. A possible solution is to make string length  
encoding part of the interface specification, then the user can read the  
length and the contents separately themselves.
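
As an example of what "part of the interface specification" could mean,  
assume strings are written as a 4-byte little-endian length followed by the  
raw bytes. That encoding and the helper names below are my assumptions,  
not something the proposal specifies:

```d
import std.bitmanip : littleEndianToNative, nativeToLittleEndian;

// Writer side: length prefix, then the contents.
void putString(ref ubyte[] sink, const(char)[] s)
{
    ubyte[4] len = nativeToLittleEndian(cast(uint) s.length);
    sink ~= len[];
    sink ~= cast(const(ubyte)[]) s;
}

// Reader side, step 1: read only the length.
uint readLength(ref const(ubyte)[] src)
{
    ubyte[4] raw = src[0 .. 4];
    src = src[4 .. $];
    return littleEndianToNative!uint(raw);
}

// Reader side, step 2: fill a buffer the caller allocated however it likes.
void readInto(ref const(ubyte)[] src, char[] dst)
{
    dst[] = cast(const(char)[]) src[0 .. dst.length];
    src = src[dst.length .. $];
}
```

Code that cares about allocation calls readLength, obtains a buffer of  
that size from its own allocator, and then calls readInto.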

> Question: How to handle associative arrays?

Not a problem with static polymorphism.
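
To spell that out with a sketch: a single templated function can detect  
associative arrays at compile time and recurse into keys and values. The  
name putValue and the textual output format are invented for this example;  
keys are sorted only to make the output deterministic:

```d
import std.algorithm.sorting : sort;
import std.conv : to;
import std.traits : isAssociativeArray;

// One template handles scalars, strings and (nested) associative arrays.
void putValue(T)(ref string sink, T value)
{
    static if (isAssociativeArray!T)
    {
        auto keys = value.keys;
        sort(keys);
        sink ~= "{";
        foreach (i, key; keys)
        {
            if (i) sink ~= ", ";
            putValue(sink, key);          // resolved statically per key type
            sink ~= ": ";
            putValue(sink, value[key]);   // and per value type
        }
        sink ~= "}";
    }
    else
    {
        sink ~= to!string(value);
    }
}
```

No extra abstract method is needed per key/value combination; the compiler  
instantiates the function for each type that is actually used.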

-- 
Best regards,
  Vladimir                            mailto:vladimir at thecybershadow.net