// Written in the D programming language.

/**
Streams are structured in two layers. At the bottom there's the
transport layer, which is responsible for opening and closing a
stream, positioning in the stream, and transferring bytes. Atop the
transport layer sits the formatting layer, which is concerned with
formatting typed data into raw bytes, which are then passed to the
underlying transport.

Macros:

WIKI = Phobos/StdAlgorithm
QUESTION = $(I Question: $0)

Copyright: Andrei Alexandrescu 2010-.

License: $(WEB boost.org/LICENSE_1_0.txt, Boost License 1.0).

Authors: $(WEB erdani.com, Andrei Alexandrescu)
 */
module std.stream2;

import std.traits : isSomeChar; // used by the template constraints below
import std.variant;

/**
The base transport interface $(D TransportBase) supports primitives
for checking whether the transport is open, closing the transport,
and positioning in the stream. Opening is not part of this interface;
it is assumed that a factory function opens the transport with the
appropriate parameters. Some streams may not actually be positionable,
in which case the positioning primitives throw.

$(QUESTION Should we offer an $(D open) primitive at this level? If
so, what parameter(s) should it take?)

$(QUESTION Should we offer a primitive $(D rewind) that takes the
stream back to the beginning? That might be supported even by some
streams that don't support general $(D seek) calls. Alternatively,
some streams might support $(D seek(0, SeekAnchor.start)) but not
other calls to $(D seek).)
 */
interface TransportBase
{
    /**
    Positions the stream $(D position) bytes from the beginning and
    returns the new absolute _position. Throws on error.
     */
    ulong seek(ulong position);

    /**
    Seeks the stream $(D position) bytes from the stream's current
    position and returns the new absolute _position. Throws on error.
     */
    ulong seekFromCurrent(long position);

    /**
    Seeks the stream $(D position) bytes from the stream's end and
    returns the new absolute _position. Throws on error.
    The semantics of this primitive for $(D position > 0) are defined
    by the stream implementation (e.g. on certain file systems, such
    calls may allow writing sparse files).

    $(QUESTION May we eliminate $(D seekFromCurrent) and $(D
    seekFromEnd) and just have $(D seek) with absolute positioning? I
    don't know of streams that allow $(D seek) without allowing $(D
    tell). Even if some stream doesn't, it's easy to add support for
    $(D tell) in a wrapper. The marginal cost of calling $(D tell) is
    small compared to the cost of $(D seek).)
     */
    ulong seekFromEnd(long position);

    /**
    Returns the absolute position in the stream. Throws on error.
     */
    ulong tell() const;

    /**
    Returns whether the stream is at its logical end. Once at the end,
    subsequent reads from the stream yield no data, and subsequent
    writes to the stream append new data.
     */
    @property bool atEnd() const;

    /**
    Is this stream open?
     */
    @property bool isOpen() const;

    /**
    Closes the stream. Does nothing on an unopened stream. Throws on
    error.

    $(QUESTION Should this throw on an unopened stream? I don't think
    so, because throwing does not offer any additional information
    that user code didn't have, and the idiom $(D if (s.isOpen)
    s.close()) is verbose and frequently encountered.)
     */
    void close();
}

/**
Unbuffered transport interfaces hold no buffers of their own and
therefore rely on user-supplied buffers to do their work.
 */
interface UnbufferedInputTransport : TransportBase
{
    /**
    Reads data off the stream and returns the data _read (which is a
    slice of $(D buffer)). If the returned slice is empty, the stream
    has ended. Reading from a stream that is $(D atEnd) just returns
    empty slices. If the stream is closed or some error occurs during
    reading, an exception is thrown.

    $(QUESTION Should we allow $(D read) to return an empty slice even
    if $(D atEnd) is $(D false)? If we do, we allow non-blocking
    streams with burst transfer.
    However, naive client code on non-blocking streams will be
    inefficient because it would essentially implement busy-waiting.)
     */
    ubyte[] read(ubyte[] buffer);
}

/**
Unbuffered output transport offers one primitive for writing. Client
code should never assume that unbuffered writes in fact go straight to
the hardware behind the stream, for at least two reasons. First, the
underlying operating system-specific primitives might not offer
guaranteed write-through (which is e.g. the case for Linux unbuffered
files). Second, $(D BufferedOutputTransport) (below) inherits $(D
UnbufferedOutputTransport) to offer guaranteed buffering. So $(D
UnbufferedOutputTransport) is best understood as "transport without
guaranteed buffering".
 */
interface UnbufferedOutputTransport : TransportBase
{
    /**
    Writes data to the stream. Throws on error.
     */
    void write(in ubyte[] buffer);

    /**
    Alias for $(D write) that supports the output range interface.
     */
    alias write put;
}

/**
Buffered transport interfaces hold internal buffers as intermediaries
between the data source and client code. The $(D
BufferedInputTransport) interface is formally an input range of $(D
ubyte[]), which means it can be used directly with a variety of
algorithms.
 */
interface BufferedInputTransport : UnbufferedInputTransport
{
    /**
    Alias for $(D atEnd) for compliance with the input range
    interface.
     */
    alias atEnd empty;

    /**
    If the internal buffer is not empty, returns the already-buffered
    data, which user code may inspect or copy as it sees fit. No
    reading from the stream is done. If there is no already-buffered
    data, reads more data off the stream first. The amount of data
    read depends on the actual stream.

    $(QUESTION Should we allow an empty _front on a non-empty stream?
    This goes back to handling non-blocking streams.)
     */
    @property ubyte[] front();

    /**
    Discards the existing buffer and reads a new one.
     */
    void popFront();

    /**
    Peeks $(D n) bytes forward in the stream.
    The buffer returned may be shorter than $(D n) only if the stream
    has ended. Following a call to $(D peek(n)), $(D front) will yield
    the same buffer.
     */
    ubyte[] peek(size_t n);

    /**
    Discards $(D n) bytes off the stream. Returns the number of bytes
    discarded, which may be less than $(D n) if and only if the stream
    has ended. The stream need not be seekable.

    $(QUESTION Should we eliminate this function? Theoretically
    calling $(D advance(n)) is equivalent to $(D seekFromCurrent(n)).
    However, in practice a file-based stream will have to implement
    $(D advance) even when the underlying file is not seekable.)
     */
    ulong advance(ulong n);
}

/**
Buffered transport interfaces hold internal buffers as intermediaries
between client code and the data destination. The $(D
BufferedOutputTransport) interface is formally an output range of $(D
ubyte[]), which means it can be used directly with a variety of
algorithms.
 */
interface BufferedOutputTransport : UnbufferedOutputTransport
{
    /**
    Normally data may not be written immediately. $(D flush) makes
    sure that buffers are actually written to the stream. It is up to
    the stream to ensure that data is written to its actual
    destination (e.g. disk).
     */
    void flush();
}

/**
The $(D Formatter) interface is concerned with formatting typed
objects into bytes. The resulting bytes are passed to a backend
transport object.
 */
interface Formatter
{
    /**
    Gets and sets the underlying _transport object. Each formatter is
    associated with one _transport object and forwards to it the bytes
    to be written after formatting. It is an error to attempt writes
    to a $(D Formatter) that has a $(D null) _transport. Also, certain
    formatters might enforce at run time that the _transport must be
    buffered.

    $(QUESTION Should all formatters require buffered _transport?
    Otherwise they might need to keep their own buffering, which ends
    up being less efficient with buffered transports.)
     */
    @property UnbufferedOutputTransport transport();
    /// Ditto
    @property void transport(UnbufferedOutputTransport);

    /**
    Formats and writes an integral _value, including a UTF character.
     */
    void put(ubyte value);
    /// Ditto
    void put(ushort value);
    /// Ditto
    void put(uint value);
    /// Ditto
    void put(ulong value);
    /// Ditto
    void put(byte value);
    /// Ditto
    void put(short value);
    /// Ditto
    void put(int value);
    /// Ditto
    void put(long value);
    /// Ditto
    void put(char value);
    /// Ditto
    void put(wchar value);
    /// Ditto
    void put(dchar value);

    /**
    Formats and writes a floating-point _value.
     */
    void put(float value);
    /// Ditto
    void put(double value);
    /// Ditto
    void put(real value);

    /**
    Formats and writes a UTF-encoded string.

    $(QUESTION Should we also define $(D putln) that writes the string
    and then a line terminator?)
     */
    void put(in char[] value);
    /// Ditto
    void put(in wchar[] value);
    /// Ditto
    void put(in dchar[] value);

    /**
    Formats and writes an array (other than strings). The type of the
    array element is passed dynamically as $(D elementType). The array
    is taken as $(D in) so that the generic $(D put_) below can
    forward its $(D in T[]) argument.
     */
    void put(in void[] value, TypeInfo elementType);

    /**
    Convenience generic function that accepts an array of any type and
    forwards it to $(D put(array, typeid(T))). Due to a bug in the
    implementation, this function is temporarily named $(D put_)
    although it will ultimately be $(D put).
     */
    final void put_(T)(in T[] array) if (!isSomeChar!T)
    {
        // typeid(T) is the static element type; typeid(T.init) would
        // dereference null when T is a class type.
        return put(array, typeid(T));
    }

    /**
    Writes a class object to the stream. The object must implement $(D
    toString(Formatter)). This function simply calls $(D
    obj.toString(this)), thereby closing a double dispatch loop. The
    responsibility of formatting the object's contents is left to the
    object.

    $(QUESTION Should we define a more involved protocol? For example,
    even for objects that don't implement formatting, a $(D Formatter)
    might define a reasonable output routine by using introspection to
    figure out the object's layout.
    This approach has the nice consequence that one implementation can
    be applied to many objects. But that also means we need to wait
    for better reflection support. We also need to figure out a way to
    detect that an object does not override $(D toString(Formatter)),
    which at the moment I consider a to-be-added primitive method of
    $(D Object).)
     */
    void put(Object obj);

    /**
    Writes a struct to the stream. This final function writes a
    customizable "header" and a customizable "footer". Inside, the
    elements of the struct are formatted transitively. Due to a bug in
    the implementation, this function is temporarily named $(D put_)
    although it will ultimately be $(D put).

    $(QUESTION Should we add support for avoiding writing the same
    subobject twice, or is that more in the charter of serialization?)
     */
    final void put_(S)(auto ref S) if (is(S == struct))
    {
    }

    /**
    Overridable hooks called before and after writing a $(D struct)'s
    fields.

    $(QUESTION How to handle associative arrays? They don't have a
    common base, as arrays do. Should we offer some overridable hooks
    similar to these? For example, $(D beforeAssocArray), $(D
    afterAssocArray), $(D beforeAssocArrayElement), $(D
    afterAssocArrayElement).)
     */
    void beforeStruct(void * s, TypeInfo ti);
    /// Ditto
    void afterStruct(void * s, TypeInfo ti);

    /**
    Formats and writes _data according to an extended $(D printf)-like
    format specifier.

    $(QUESTION How to define format specifiers for $(D struct)s and
    $(D class)es in ways that extend $(D printf) specifiers
    naturally?)

    $(QUESTION Should we define $(D writefln) too? Note that it only
    makes sense for streams that use a text-based transport.)
     */
    void writef(in char[] format, Variant[] data...);
}

/**
$(D Unformatter) is an interface for formatted reads. The name $(D
Parser) has been avoided in order to prevent confusion with the
meaning of "parser" in formal grammars.
 */
interface Unformatter
{
    /**
    Gets and sets the underlying _transport object.
    Each unformatter is associated with one _transport object. It is
    an error to attempt reads from an $(D Unformatter) that has a $(D
    null) _transport. Also, certain unformatters might enforce at run
    time that the _transport must be buffered.
     */
    @property UnbufferedInputTransport transport();
    /// Ditto
    @property void transport(UnbufferedInputTransport);

    /**
    Reads an integral _value, including a UTF character.
     */
    void read(ref ubyte value);
    /// Ditto
    void read(ref ushort value);
    /// Ditto
    void read(ref uint value);
    /// Ditto
    void read(ref ulong value);
    /// Ditto
    void read(ref byte value);
    /// Ditto
    void read(ref short value);
    /// Ditto
    void read(ref int value);
    /// Ditto
    void read(ref long value);
    /// Ditto
    void read(ref char value);
    /// Ditto
    void read(ref wchar value);
    /// Ditto
    void read(ref dchar value);

    /**
    Reads a floating-point _value.
     */
    void read(ref float value);
    /// Ditto
    void read(ref double value);
    /// Ditto
    void read(ref real value);

    /**
    Reads a UTF-encoded string.

    $(QUESTION Should we pass the size in advance, or make the stream
    responsible for inferring it?)
     */
    void read(ref char[] value);
    /// Ditto
    void read(ref wchar[] value);
    /// Ditto
    void read(ref dchar[] value);

    /**
    Reads an array (other than strings). The type of the array element
    is passed dynamically as $(D elementType).
     */
    void read(ref void[] value, TypeInfo elementType);

    /**
    Convenience generic function that accepts an array of any type and
    forwards it to $(D read(array, typeid(T))). Due to a bug in the
    implementation, this function is temporarily named $(D read_)
    although it will ultimately be $(D read).
     */
    final void read_(T)(ref T[] array) if (!isSomeChar!T)
    {
        // A T[] cannot bind to ref void[], so go through a temporary.
        void[] raw = array;
        read(raw, typeid(T));
        array = cast(T[]) raw;
    }

    /**
    Reads a class object from the stream. This is the counterpart of
    $(D Formatter.put(Object)), which calls $(D obj.toString(this)) and
    leaves the responsibility of formatting the object's contents to
    the object.

    $(QUESTION Should we define a more involved protocol? For example,
    even for objects that don't implement formatting, a $(D Formatter)
    might define a reasonable output routine by using introspection to
    figure out the object's layout. This approach has the nice
    consequence that one implementation can be applied to many
    objects. But that also means we need to wait for better reflection
    support. We also need to figure out a way to detect that an object
    does not override $(D toString(Formatter)), which at the moment I
    consider a to-be-added primitive method of $(D Object).)
     */
    void read(ref Object obj);

    /**
    Reads a struct from the stream. This final function reads a
    customizable "header" and a customizable "footer". Inside, the
    elements of the struct are read transitively. Due to a bug in the
    implementation, this function is temporarily named $(D read_)
    although it will ultimately be $(D read).
     */
    final void read_(S)(ref S) if (is(S == struct))
    {
    }

    /**
    Overridable hooks called before and after reading a $(D struct)'s
    fields.

    $(QUESTION How to handle associative arrays? They don't have a
    common base, as arrays do. Should we offer some overridable hooks
    similar to these? For example, $(D beforeAssocArray), $(D
    afterAssocArray), $(D beforeAssocArrayElement), $(D
    afterAssocArrayElement).)
     */
    void beforeStruct(void * s, TypeInfo ti);
    /// Ditto
    void afterStruct(void * s, TypeInfo ti);

    /**
    Convenience function that forwards to the appropriate by-reference
    overload. Due to a bug in the implementation, this function is
    temporarily named $(D read_) although it will ultimately be $(D
    read).
     */
    final T read_(T)()
    {
        T result;
        read(result);
        return result;
    }

    /**
    Reads _data according to an extended $(D scanf)-like format
    specifier.
     */
    void readf(in char[] format, Variant[] data...);
}
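
/*
Illustration only, not part of the proposed API. The transport contracts
documented above (positioning, atEnd, read returning a slice of the
caller's buffer, close being a no-op when already closed) may be easier
to judge against a concrete case, so here is a minimal sketch of an
in-memory input transport. The name MemoryInputTransport, its
constructor standing in for the factory function, and the decision to
reject seeks past the end are all assumptions made for this example.
*/
version (unittest)
{
    import std.exception : enforce;

    // Hypothetical transport that serves bytes from an array.
    private final class MemoryInputTransport : UnbufferedInputTransport
    {
        private const(ubyte)[] data; // bytes served by the transport
        private size_t pos;          // current absolute position
        private bool open = true;    // the constructor acts as the factory

        this(const(ubyte)[] data) { this.data = data; }

        ulong seek(ulong position)
        {
            enforce(open, "transport is closed");
            // Assumption: this sketch does not allow seeking past the end.
            enforce(position <= data.length, "seek past the end");
            pos = cast(size_t) position;
            return pos;
        }

        // Helper: computes a signed target from an anchor, then seeks.
        private ulong seekRelative(size_t anchor, long offset)
        {
            immutable target = cast(long) anchor + offset;
            enforce(target >= 0, "seek before the beginning");
            return seek(cast(ulong) target);
        }

        ulong seekFromCurrent(long position)
        {
            return seekRelative(pos, position);
        }

        ulong seekFromEnd(long position)
        {
            return seekRelative(data.length, position);
        }

        ulong tell() const { return pos; }

        @property bool atEnd() const { return pos >= data.length; }

        @property bool isOpen() const { return open; }

        // Closing twice is harmless, as the interface documentation asks.
        void close() { open = false; }

        ubyte[] read(ubyte[] buffer)
        {
            enforce(open, "transport is closed");
            immutable remaining = data.length - pos;
            immutable n = buffer.length < remaining ? buffer.length : remaining;
            buffer[0 .. n] = data[pos .. pos + n];
            pos += n;
            return buffer[0 .. n]; // empty once the stream has ended
        }
    }
}

// Usage of the hypothetical transport through the interface only.
unittest
{
    import std.exception : assertThrown;

    ubyte[] source = [1, 2, 3, 4, 5];
    UnbufferedInputTransport t = new MemoryInputTransport(source);
    ubyte[3] buf;

    assert(t.isOpen && !t.atEnd);
    assert(t.read(buf[]) == source[0 .. 3]);
    assert(t.tell() == 3);
    assert(t.read(buf[]) == source[3 .. 5]); // short read near the end
    assert(t.atEnd);
    assert(t.read(buf[]).length == 0);       // reads at the end are empty
    assert(t.seekFromEnd(-2) == 3);
    assert(t.seekFromCurrent(-3) == 0);

    t.close();
    t.close();                               // no-op on a closed transport
    assert(!t.isOpen);
    assertThrown(t.read(buf[]));             // reading a closed stream throws
}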