An IO Streams Library

Sun Feb 7 02:50:24 PST 2016

On Sun, 07 Feb 2016 00:48:54 +0000,
Jason White <54f9byee3t32 at gmail.com> wrote:

> I see the subject of IO streams brought up here occasionally. The 
> general consensus seems to be that we need something better than 
> what Phobos provides.
> 
> I wrote a library "io" that can work as a replacement for 
> std.stdio, std.mmfile, std.cstream, and parts of std.stream:
> 
>      GitHub:  https://github.com/jasonwhite/io
>      Package: https://code.dlang.org/packages/io
> 
> This library provides an input and output range interface for 
> streams (which is more efficient if the stream is buffered). 
> Thus, many of the wonderful range operations from std.range and 
> std.algorithm can be used with this.
> 
> I'm interested in feedback on this library. What is it missing? 
> How can it be better?
> 
> I'm also interested in a discussion of what IO-related 
> functionality people are missing in Phobos.
> 
> Please destroy!

I saw this on code.dlang.org some time ago and had a quick look. First
of all this would have to go into phobos to make sure it's used as some
kind of a standard. Conflicting stream libraries would only cause more
trouble.

Then if you want to go for phobos inclusion I'd recommend looking at
other stream implementations and learning from their mistakes ;-)
There's
https://github.com/schveiguy/phobos/tree/babe9fe338f03cafc0fb50fc0d37ea96505da3e3/std/io
which was supposed to be a stream replacement for phobos. Then there
are also vibe.d streams*.

Your Stream interfaces look like standard stream implementations (which
is a good thing) and also work for unbuffered streams. I think it's a
good idea to support partial reads and writes. For an explanation of
why partial reads matter, see the vibe.d rant below. Partial writes are
useful because a write syscall can be interrupted by POSIX signals to
stop the write. I'm not sure if the API should expose this feature
(e.g. by returning a partial write on EINTR) but it can sometimes be
useful. Still, readExactly / writeAll helper functions are useful. I
would try to implement these as UFCS functions instead of as a struct
wrapper.
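
A minimal sketch of that split, assuming a stream whose read and write
may process fewer bytes than requested. It is written in Python only
for brevity; ChunkyStream, read_exactly and write_all are hypothetical
names, not part of the io library, Phobos or vibe.d. Python has no
UFCS, so plain free functions stand in for it here.

```python
import io


class ChunkyStream:
    """Hypothetical unbuffered stream with partial reads and writes."""

    def __init__(self, data=b"", max_chunk=3):
        self._src = io.BytesIO(data)
        self.sink = bytearray()
        self._max = max_chunk

    def read(self, n):
        # Partial read: at most max_chunk bytes, b"" at end of stream.
        return self._src.read(min(n, self._max))

    def write(self, data):
        # Partial write: accepts at most max_chunk bytes, returns the count.
        part = bytes(data[:self._max])
        self.sink += part
        return len(part)


def read_exactly(stream, n):
    """Repeat partial reads until n bytes have arrived."""
    out = bytearray()
    while len(out) < n:
        chunk = stream.read(n - len(out))
        if not chunk:
            raise EOFError(f"wanted {n} bytes, got {len(out)}")
        out += chunk
    return bytes(out)


def write_all(stream, data):
    """Repeat partial writes until every byte has been accepted."""
    view = memoryview(data)
    while len(view) > 0:
        written = stream.write(view)
        if written == 0:
            raise IOError("stream accepted no data")
        view = view[written:]
```

The stream itself only implements the primitive operations; the
retry loops live in one generic place and work with any stream that
offers read and write.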

For some streams you'll need a TimeoutException. An interesting
question is whether users should be able to recover from
TimeoutExceptions. The problem: if a read/write function internally
makes more than one POSIX read/write call and only the last one times
out, some data has already been processed, and it's not possible to
recover from the TimeoutException if the amount of already processed
data is unknown.
The simplest solution is to use only one syscall internally. Then
TimeoutException => no data was processed. But this doesn't work for
read/writeExactly (another reason why read/writeExactly shouldn't be
the default. vibe.d...)
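
To make the recovery problem concrete, here is a small Python sketch.
It is one possible approach, not something the post or any of the
mentioned libraries prescribes: the multi-call helper reports how much
it had already read when the timeout hit. FlakyStream, PartialTimeout
and read_exactly are hypothetical names, and each read() call stands in
for a single syscall that either returns data or times out without
consuming anything.

```python
class PartialTimeout(TimeoutError):
    """Hypothetical exception that carries the bytes read before the timeout."""

    def __init__(self, partial):
        super().__init__(f"timed out after {len(partial)} bytes")
        self.partial = partial


class FlakyStream:
    """Hypothetical stream driven by a script: bytes = data, None = timeout."""

    def __init__(self, events):
        self._events = list(events)

    def read(self, n):
        if not self._events:
            return b""
        event = self._events.pop(0)
        if event is None:
            # One call, one timeout: nothing was consumed.
            raise TimeoutError("read timed out")
        if len(event) > n:
            # Keep the unread remainder for the next call.
            self._events.insert(0, event[n:])
        return event[:n]


def read_exactly(stream, n):
    """Several read() calls; on timeout, hand back what was already read."""
    out = bytearray()
    while len(out) < n:
        try:
            chunk = stream.read(n - len(out))
        except TimeoutError:
            raise PartialTimeout(bytes(out)) from None
        if not chunk:
            raise EOFError(f"wanted {n} bytes, got {len(out)}")
        out += chunk
    return bytes(out)
```

A caller that catches PartialTimeout can keep e.partial and ask only
for the remaining bytes. Without that information the already consumed
data would be lost, which is the unrecoverable case described above.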

Regarding buffers / sliding windows I'd have a look at
https://github.com/schveiguy/phobos/blob/babe9fe338f03cafc0fb50fc0d37ea96505da3e3/std/io/buffer.d

Another design question is whether there should be an interface for
such buffered streams or whether it's OK to have only unbuffered
streams + one buffer struct / class. Basically the question is whether
there might be streams that can offer a buffer interface but can't use
the standard implementation.

* vibe.d stream rant ahead:

vibe.d streams get some things right and some things very wrong. For
example their leastSize/empty/read combo means you might actually
have to implement reading data in any of these functions. Users have to
handle timeouts or other errors for any of these as well.

Then the API requires a buffered stream; it simply won't work for
unbuffered IO (leastSize, empty). And the fact that read reads exactly
n bytes makes stream implementations more complicated (re-reading until
enough data has been read should be done by a generic function, not
reimplemented in every stream). It even makes some user code more
complicated: I've implemented a serial port library for vibe.d.
If I don't know how many bytes will arrive with the next packet, the
POSIX read function usually returns the expected/available amount of
data. But vibe.d requires me to specify a fixed length when calling
the stream read method. This leads to ugly code using peek...

Then vibe.d also mixes the sliding window / buffer concept into the
stream class, but does so in a bad way. A sliding window should expose
the internal buffer so that it's possible to consume bytes from the
buffer, skip bytes, refill... In vibe.d you can peek at the buffer. But
you can't discard data. You'll have to call read instead which copies
from the internal buffer to an external buffer, even if you only want
to skip data. Even worse, your external buffer size is limited. So you
have to implement some loop logic if you want to skip more data than
fits your buffer. And all you need is a discard(size_t n) function which
does _buffer = _buffer[n .. $] in the stream class...
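
A rough Python sketch of such a sliding window, assuming a source
object with a partial read(n) method. SlidingWindow, refill, peek,
discard and skip are hypothetical names for illustration and do not
mirror the vibe.d or std.io APIs. Note that slicing a Python bytes
object copies the remainder, whereas the D slice above does not; the
point here is only that no caller-side buffer is needed.

```python
import io


class SlidingWindow:
    """Hypothetical buffer in front of any object with read(n)."""

    def __init__(self, source, chunk_size=4096):
        self._source = source
        self._chunk = chunk_size
        self._buffer = b""

    def refill(self):
        """Append one chunk from the source; return the bytes added."""
        data = self._source.read(self._chunk)
        self._buffer += data
        return len(data)

    def peek(self):
        """Expose the buffered bytes without consuming them."""
        return self._buffer

    def discard(self, n):
        """Drop up to n buffered bytes; return how many were dropped."""
        n = min(n, len(self._buffer))
        self._buffer = self._buffer[n:]
        return n


def skip(window, n):
    """Skip n stream bytes without an external buffer; return bytes skipped."""
    remaining = n
    while remaining > 0:
        if not window.peek() and window.refill() == 0:
            break  # end of stream
        remaining -= window.discard(remaining)
    return n - remaining
```

With discard available, skipping is a generic helper written once, and
user code never has to allocate a scratch buffer just to throw data
away.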

TLDR: API design is very important.