Streaming library

Wed Oct 13 13:19:45 PDT 2010

On 10/13/10 14:02 CDT, Denis Koroskin wrote:
> On Wed, 13 Oct 2010 20:55:04 +0400, Andrei Alexandrescu
> <SeeWebsiteForEmail at erdani.org> wrote:
>> http://www.gnu.org/s/libc/manual/html_node/Buffering-Concepts.html
>>
>> I don't think streams must mimic the low-level OS I/O interface.
>>
>
> I, in contrast, think that Streams should be the lowest-level possible
> platform-independent abstraction.
> No buffering besides what an OS provides, no additional functionality.
> If you need to be able to read something up to some character (besides,
> what should be considered a new-line separator: \r, \n, \r\n?), this
> should be done manually in "byLine".

This complicates client code for the sake of simplicity in a library that 
was supposed to make streaming easy. I'm not seeing progress.

>>> That's because
>>> most of the streams are binary streams, and there is no such thing as a
>>> "line" in them (e.g. how often do you need to read a line from a
>>> SocketStream?).
>>
>> http://www.opengroup.org/onlinepubs/009695399/functions/isatty.html
>>
>
> These are special cases I don't like. There is no such thing in Windows
> anyway.

I didn't say I like them. Windows has _isatty: 
http://msdn.microsoft.com/en-us/library/f4s0ddew(v=VS.80).aspx

>> You need a line when e.g. you parse an HTML header or an email header or
>> an FTP response. Again, if at a low level the transfer occurs in
>> blocks, that doesn't mean the API must do the same at all levels.
>>
>
> BSD sockets transmit in blocks. If you need to find a special sequence
> in a socket stream, you are forced to fetch a chunk, and manually search
> for a needed sequence. My position is that you should do it with an
> external predicate (e.g. read until whitespace).

The problem is how to set up the interfaces so as to avoid inefficiencies 
and contortions in the client.

>>> I don't think streams should buffer anything either (what an underlying
>>> OS I/O API caches should suffice), buffered streams adapters can do that
>>> in a stream-independent way (why duplicate code when you can do that as
>>> efficiently with external methods?).
>>
>> Most OS primitives don't give access to their own internal buffers.
>> Instead, they ask user code to provide a buffer and transfer data into
>> it.
>
> Right. This is why Stream may not cache.

This is a big misunderstanding. If the interface is:

size_t read(ubyte[] buffer);

then *I*, the client, need to provide the buffer. It's in client space. 
This means that, willing or not, I need to do buffering, regardless of 
whatever internal buffering is going on under wraps.
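
To make this concrete, here is a minimal sketch in D of what any caller of 
such a read() ends up writing. MemoryStream and countBytes are made-up 
names used only for illustration, and the in-memory stream merely stands 
in for a real file or socket. The point is where the storage lives: the 
array is declared, sized, and owned by the client.

```d
import std.algorithm : min;

// Hypothetical in-memory stream, present only to make the sketch runnable.
final class MemoryStream
{
    private const(ubyte)[] data;

    this(const(ubyte)[] data) { this.data = data; }

    // Copies up to buffer.length bytes into the caller's buffer and
    // returns the number of bytes copied; 0 means end of stream.
    size_t read(ubyte[] buffer)
    {
        immutable n = min(buffer.length, data.length);
        buffer[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// Client code: the buffer is allocated here, in client space.
size_t countBytes(MemoryStream s)
{
    ubyte[4] buffer;              // storage chosen and owned by the caller
    size_t total = 0;
    for (;;)
    {
        immutable n = s.read(buffer[]);
        if (n == 0) break;        // end of stream
        total += n;               // only buffer[0 .. n] holds valid data
    }
    return total;
}
```

Whatever the OS caches underneath, the caller still has to pick a buffer 
size and keep track of how much of it is valid after each call.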

>> So clearly buffering on the client side is a must.
>>
>
> I don't see how that is implied by the above.

Please implement an abstraction that given this:

interface InputStream
{
     size_t read(ubyte[] buf);
}

defines a line reader.
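
For reference, one possible shape of such a line reader is sketched 
below. It is an illustration under stated assumptions, not a proposed 
design: LineReader, readLine, and ChunkedMemoryStream are hypothetical 
names, only '\n' is treated as a terminator (the \r vs. \r\n question is 
left open), and the 64-byte block size is arbitrary. Note the `pending` 
member: a single read() may return bytes past the end of the current 
line, so the reader has to hold on to them between calls.

```d
import std.algorithm : min;

interface InputStream
{
    size_t read(ubyte[] buf);
}

// Hypothetical in-memory stream that hands out at most `chunk` bytes per
// call, imitating a socket that delivers data in arbitrary blocks.
final class ChunkedMemoryStream : InputStream
{
    private const(ubyte)[] data;
    private size_t chunk;

    this(const(ubyte)[] data, size_t chunk)
    {
        this.data = data;
        this.chunk = chunk;
    }

    size_t read(ubyte[] buf)
    {
        immutable n = min(buf.length, chunk, data.length);
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// Hypothetical line reader layered on InputStream.
final class LineReader
{
    private InputStream source;
    private ubyte[] pending;   // bytes already read but not yet returned
    private bool eof;

    this(InputStream source) { this.source = source; }

    // Stores the next line, without its '\n', in `line`.
    // Returns false once the stream is exhausted.
    bool readLine(out ubyte[] line)
    {
        size_t scanned = 0;    // prefix of `pending` known to hold no '\n'
        for (;;)
        {
            foreach (i; scanned .. pending.length)
            {
                if (pending[i] == '\n')
                {
                    line = pending[0 .. i].dup;
                    pending = pending[i + 1 .. $];  // keep the leftover
                    return true;
                }
            }
            scanned = pending.length;
            if (eof) break;
            ubyte[64] block;   // client-side buffer handed to read()
            immutable n = source.read(block[]);
            if (n == 0) eof = true;
            else pending ~= block[0 .. n];
        }
        if (pending.length == 0) return false;
        line = pending;        // final line with no terminator
        pending = null;
        return true;
    }
}
```

A caller would construct `new LineReader(stream)` and loop on 
`readLine(line)` until it returns false. The leftover bytes in `pending` 
are exactly the client-side buffering under discussion.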

Andrei