Streaming transport interfaces: input

Denis Koroskin 2korden at gmail.com
Thu Oct 14 10:56:19 PDT 2010


On Thu, 14 Oct 2010 21:39:03 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail at erdani.org> wrote:

> On 10/14/10 12:27 CDT, Steven Schveighoffer wrote:
>> On Thu, 14 Oct 2010 11:34:12 -0400, Andrei Alexandrescu
>> <SeeWebsiteForEmail at erdani.org> wrote:
>> Please, use the term "seek", and allow an anchor. Every OS allows this,
>> it makes no sense not to provide it.
>
> I've always thought that's a crappy appendix. Every OS that ever allows  
> seek/tell with anchors allows ALL anchors, and always allows either both  
> or none of seek and tell. So I decided to cut through the crap and  
> simplify. You want to seek 100 bytes from here, you write  
> stream.position = stream.position + 100.
>
> Oh, that reminds me I need to provide length as a property as well. This  
> would save us crap like seek(0, SEEK_END); ftell() to figure out the  
> length of a file.
>
> I have no sympathy for seek and tell with anchors.
>
>> I don't like appendDelim. We don't need to define that until we have
>> buffering.
>
> Why?
>
>> The simple function of an input stream is to read data.
>
> It does read data.
>
>> With
>> buffering you get all the goodies that you want, but the buffer should
>> be in control of its data buffer.
>
> I think the appendDelim method allows fast and simple implementations of  
> a variety of patterns. As I (thought I) have shown elsethread, without  
> appendDelim there's no way to efficiently implement a line-oriented  
> stream on top of a block-oriented one.
>
>> Basically, appendDelim can be defined outside this class, because the
>> primitive read is enough.
>
> You can only define it if you accept extra copying. I'd say one extra  
> interface function is acceptable for fast I/O.
>
>> Shouldn't the text transport be defined on top of the binary transport?
>
> No, because there are transports that genuinely do not accept binary  
> data.
>
>> And in any case, I'd expect buffering to go between the two.
>
> How do you define buffering? Would a buffered transport implement a  
> different interface?
>
>> If all you
>> are adding are the different widths of characters, I don't think you
>> need this extra layer. It's going to make the buffering layer more
>> difficult to implement (now it must handle both a text version and
>> a binary version).
>
> I don't understand this.
>
>
> Andrei

appendDelim *requires* buffering to be implemented. No OS provides an
API that reads from a file (be it a pipe, socket, whatever) up to some
abstract delimiter. It *always* reads in blocks. As such, if you need to
read until a delimiter, you need to fetch a block into some internal
buffer, MANUALLY search through it and THEN copy to the output string.
I've implemented that on top of a chunked read interface, and it was 5%
faster than the getline()/getdelim() that GNU libc provides (despite you
claiming it to be "many times faster"). It's not.
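
To make the fetch/search/copy steps concrete, here is a minimal sketch in
Python (not D, and not the implementation I benchmarked). The names
read_until and read_chunk are hypothetical and are not part of the proposed
interface. read_chunk stands in for any block-oriented read primitive that
returns an empty result at end of input.

```python
import io


def read_until(read_chunk, delim=b"\n", chunk_size=4096):
    """Yield delimiter-terminated records on top of a chunked read.

    read_chunk is a hypothetical primitive: read_chunk(n) returns up to
    n bytes, or b"" at end of input.
    """
    buf = bytearray()                 # internal buffer of unconsumed bytes
    while True:
        idx = buf.find(delim)         # manual search through the buffer
        while idx != -1:
            end = idx + len(delim)
            yield bytes(buf[:end])    # copy the record to the output
            del buf[:end]             # drop the consumed bytes
            idx = buf.find(delim)
        chunk = read_chunk(chunk_size)  # fetch the next block
        if not chunk:
            if buf:
                yield bytes(buf)      # trailing record without delimiter
            return
        buf += chunk


# Usage: a tiny chunk size forces records to span several blocks.
src = io.BytesIO(b"alpha\nbeta\ngamma")
lines = list(read_until(src.read, chunk_size=4))
```

The sketch only illustrates the structure: the buffer lives inside the
line reader, while the underlying transport exposes nothing but a block
read. It says nothing about relative performance.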

Buffering requires an additional level of data copying, and this is bad
for fast I/O. If you need fast I/O, you must pull that out of the stream
interface. Otherwise chunked read will be less efficient due to additional
copies to and from buffers.

On the contrary, line-based reading can be implemented on top of the
chunked read without sacrificing a tiny bit of efficiency.

