Streaming library

Wed Oct 13 07:32:15 PDT 2010

On 10/11/2010 07:49 PM, Daniel Gibson wrote:
> Andrei Alexandrescu schrieb:
>> Agreed. Maybe this is a good time to start making a requirements list
>> for streams. What are the essential features/feature groups?
>>
>> Andrei
>
> Maybe something like the following (I hope it's not too extensive):
>
> * Input-, Output- and InputAndOutput-Streams
>   - having InputStream and OutputStream as an interface like in the old
>     design may be a good idea
>   - implementing the standard operations that are mostly independent from
>     the data source/sink, like read/write for basic types, strings, ...,
>     in mixin templates is probably an elegant way to create streams that
>     are both Input and Output (one mixin that implements most of
>     InputStream and one that implements most of OutputStream)

So far so good. I will point out, however, that the classic read/write 
routines are not all that good. For example, if you want to implement a 
line-buffered stream on top of a block-buffered stream you'll be forced 
to write inefficient code.
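
One way to read the point about line buffering: if the lower layer only 
offers a copying read(buf, len), a line reader must either fetch one byte 
per call or over-read and then keep the surplus somewhere. A minimal 
sketch, assuming the block-buffered layer exposes its buffer instead. The 
names BlockBuffer, window, fetchMore, consume and readLine are 
hypothetical, and the "device" is an in-memory array so that the window 
can be a plain slice:

```d
/// Hypothetical block-buffered source that exposes its buffer to the layer above.
struct BlockBuffer
{
    const(ubyte)[] source;      // stands in for a file or socket
    size_t blockSize = 4096;
    private size_t start, end;  // the window is source[start .. end]

    /// Bytes fetched so far but not yet consumed.
    const(ubyte)[] window() const { return source[start .. end]; }

    /// Extends the window by up to one block; returns false at end of input.
    bool fetchMore()
    {
        if (end == source.length) return false;
        immutable rest = source.length - end;
        end += rest < blockSize ? rest : blockSize;
        return true;
    }

    /// Drops the first n bytes of the window.
    void consume(size_t n) { start += n; }
}

/// Stores the next line (without its '\n') in `line`; returns false at end of input.
bool readLine(ref BlockBuffer buf, out const(ubyte)[] line)
{
    size_t scanned = 0;  // bytes of the window already searched for '\n'
    for (;;)
    {
        auto w = buf.window();
        foreach (i; scanned .. w.length)
        {
            if (w[i] == '\n')
            {
                line = w[0 .. i];
                buf.consume(i + 1);
                return true;
            }
        }
        scanned = w.length;
        if (!buf.fetchMore())
        {
            if (w.length == 0) return false;
            line = w;  // last line has no terminator
            buf.consume(w.length);
            return true;
        }
    }
}
```

The line reader scans the window in place and asks for more data only 
when it runs out, so no byte is searched twice and nothing is read one 
call at a time.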

Also, a requirement that I think is essential is separation between 
formatting and transport. std.stream does not have that. At the top 
level there are two types of transport: text and binary. On top of that 
lie various formatters.
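
To make the separation concrete, here is a minimal sketch, assuming a 
single byte-oriented transport for brevity (the text/binary distinction 
is left out). ByteSink, MemorySink, TextFormatter and BinaryFormatter are 
hypothetical names, not part of std.stream or any proposed design:

```d
import std.conv : to;

/// Hypothetical transport: moves raw bytes and knows nothing about formatting.
interface ByteSink
{
    void put(const(ubyte)[] data);
}

/// Hypothetical in-memory transport, handy for trying things out.
class MemorySink : ByteSink
{
    ubyte[] data;
    void put(const(ubyte)[] chunk) { data ~= chunk; }
}

/// A formatter layered on a transport: writes an int as decimal text.
struct TextFormatter
{
    ByteSink sink;
    void write(int value)
    {
        sink.put(cast(const(ubyte)[]) to!string(value));
    }
}

/// A second formatter over the same transport: four bytes, little-endian.
struct BinaryFormatter
{
    ByteSink sink;
    void write(int value)
    {
        ubyte[4] bytes;
        foreach (i; 0 .. 4)
            bytes[i] = cast(ubyte)(value >> (8 * i));
        sink.put(bytes[]);
    }
}
```

Neither formatter knows where the bytes end up, and the transport never 
sees an int, so either side can be swapped without touching the other.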

> * Two kinds of streams:
>   1. basic streams: reading/writing from/to:
>      * network (socket)
>      * files
>      * just memory (currently MemoryStream)
>      * Arrays/Ranges?
>      * ...
>   2. streams wrapping other streams:
>      * for buffering - buffer input/output/both
>        - with the possibility to peek?
>      * to modify data when it's read/written (e.g. change endianness -
>        important for networking!)
>      * custom streams, e.g. could parse/create CSV (comma-separated values)
>        data or similar

Would these streams be different in their interface?

> * Also there are different types of streams: seekable, resettable (a
> network stream is neither), ...

Agreed. Question: is there a file system that offers resettable but not 
seekable files? I'm thinking of collapsing the two together.

> * functionality/methods needed/desirable:
>   - low level access
>     * void read(void *buf, size_t len) // read *exactly* len bytes into buf
>     * void write(void *buf, size_t len) // write *exactly* len bytes from
>       buf to stream
>   - convenient methods to read/write basic types in binary (!) from/to stream
Again, binary vs. text is a capability of the stream. For example, a tty 
can never transport binary data - programs like gzip refuse to write 
binary data to a terminal. (Then of course a binary stream can always 
accommodate text data.)

>     * <type> read<Type>() (like int readInt()) or T read(T)() (like int
>       read!int())

Templates will be difficult for a class hierarchy: template methods 
cannot be virtual, so derived streams cannot override them.

>       - with enforcing T is somehow basic (certainly no Object or pointer)
>       - could use read(void *buf, size_t len) like in old implementation
>     * void write( <basic type> val ) or void write(T)( T val ) - again T
>       should be basic type
>       - could use write(void *buf, size_t len) like in old implementation
>   - convenient methods to read/write arrays of T (T should again be a
>     basic type)
>     * T[] readArray(T)( size_t len) // return array of T's containing len T's
>       - probably both alternatives make sense - the first one to write
>         into an existing array (-slice), the second one for convenience
>         if you want a new array anyway
>     * void read(T)( T[] array ) // read array.length T's into array
>       - maybe name this readArray(T)(..) as well for consistency?
>     * void writeArray(T)( T[] array )
>   - special cases for strings?
>     * void writeString(char[] str) // same for wchar and dchar
>       - could write str into the stream with its length (as ushort xor
>         uint xor ulong, _not_ size_t!) prepended
>     * char[] readString() // same for wchar and dchar
>       - read length of the string and then the string itself that will
>         be returned
Many of these capabilities involve template methods. Is a template-based 
approach preferable to a straight class hierarchy? I tend to think that 
in the case of streams, classic hierarchies are most adequate.
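
The two approaches can also be combined. A minimal sketch, assuming one 
virtual primitive per transport with non-virtual template helpers layered 
on top of it. InputStream, readSome, readExact, read and 
MemoryInputStream are hypothetical names here, and values are read in 
host byte order:

```d
/// Hypothetical base class: derived streams override only readSome.
abstract class InputStream
{
    /// Reads up to buf.length bytes; returns the count actually read (0 at end).
    abstract size_t readSome(ubyte[] buf);

    /// Non-virtual helper: fills buf completely or throws.
    final void readExact(ubyte[] buf)
    {
        while (buf.length > 0)
        {
            immutable n = readSome(buf);
            if (n == 0)
                throw new Exception("unexpected end of stream");
            buf = buf[n .. $];
        }
    }

    /// Template members are never virtual in D, so this one only forwards
    /// to the virtual primitive. T is restricted to arithmetic types.
    final T read(T)() if (__traits(isArithmetic, T))
    {
        T value;
        readExact((cast(ubyte*) &value)[0 .. T.sizeof]);
        return value;
    }
}

/// Hypothetical in-memory stream used to exercise the base class.
class MemoryInputStream : InputStream
{
    private const(ubyte)[] data;

    this(const(ubyte)[] data) { this.data = data; }

    override size_t readSome(ubyte[] buf)
    {
        immutable n = buf.length < data.length ? buf.length : data.length;
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}
```

The hierarchy stays small (one method to override), while read!int() and 
friends remain available to callers without being part of the virtual 
interface.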

>   - all that array/string/low level stuff but reading *at most* len (or
>     array.length) values and returning the amount actually read
>     ( readUpTo() ?)
>     * useful e.g. for parsing HTTP (you don't know how long the header is etc)
>     * the same for write? don't see much use for that though..
>
>   - some way to determine whether the stream
>     * is at its definite end (eof on file, socket closed or something like
>       that)
>     * currently empty (for input stream) - just doing a read() would block ?
>
>   - Output streams need flush()
>   - for Input streams skip(size_t noBytes) or even skip(T)(size_t noElems)
>     may be handy to just throw away data we're not interested in without
>     having it copied around - especially for non-seekable streams (network..)

OK, that's a good start. Let's toss this back and forth a few times and 
see what sticks.

Andrei