stream interfaces - with ranges

Fri May 18 08:13:53 PDT 2012

On Fri, 18 May 2012 10:39:55 -0400, kenji hara <k.hara.pg at gmail.com> wrote:

> 2012/5/18 Steven Schveighoffer <schveiguy at yahoo.com>:
>> On Fri, 18 May 2012 00:19:45 -0400, kenji hara <k.hara.pg at gmail.com>  
>> wrote:
>>
>>> I think range interface is not useful for *efficient* IO. The expected
>>> IO interface will be more *abstract* than range primitives.
>>
>>
>> If all you are doing is consuming data and processing it, range
>> interface is efficient.  Most streaming implementations that are
>> synchronous use:
>>
>> 1. read block of data from low-level source into buffer
>> 2. process buffer
>> 3. If still data left, go to step 1.
>>
>> 1 is done via popFront, 2 is done via front.
>>
>> 3 is somewhat available via empty, but empty kind of depends on reading
>> data.  I think it can work.
>>
>> It's not the ideal interface for all aspects of i/o, but it does map to
>> ranges, and for single purpose tasks (such as parse an XML file), it
>> will be most efficient.
>
> Almost agree. When we want to do I/O, it is either synchronous or
> asynchronous.
> Only a few people would use a non-blocking interface.
> But for the library implementation, a non-blocking interface is still
> important.
> I think the non-blocking interface should be designed to avoid copying
> as far as possible, and achieving that with the range interface is
> impossible in general.

On non-blocking i/o, why not simply leave the range interface
unsupported?  I don't have any problem with that.  In other words, if
your input source is non-blocking and you try to use range primitives, it
simply won't work.

I admit, all of my code so far is focused on blocking i/o.  I have some
experience with non-blocking i/o, but it was to make a blocking interface
that supported waiting for data with a timeout.  Making a cross-platform
(i.e. both Windows and Posix) non-blocking interface is difficult because
the two OSes use very different mechanisms.

And a lot of times, you don't want non-blocking i/o, but rather parallel  
i/o.

>>> ---
>>> If you use the range I/F to read bytes from a device, we will always
>>> do blocking IO - even if the device is a socket. It is not efficient.
>>>
>>> auto sock = new TcpSocketDevice();
>>> if (!sock.empty) { auto e = sock.front; }
>>>  // In the empty primitive, we *must* wait until the socket gets one
>>>  // or more bytes or is really disconnected.
>>>  // If not, what exactly does sock.front return?
>>>  // So using the range interface for socket reading enforces
>>>  // blocking IO. It is *really* inefficient.
>>> ---
>>
>>
>> Sockets do not have to be blocking, and I/O does not have to use the
>> range portion of the interface.
>>
>> And efficient I/O has little to do with synchronicity and more to do
>> with reading a large amount of data at a time instead of byte by byte.
>>
>> Using multi-threads or fibers, and using OS primitives such as select or
>> poll can make I/O quite efficient and allow you to do other things
>> while no I/O is happening.  These will not happen with range interface,
>> but will be available through other interfaces.
>
> I have talked about *good I/O primitives for library implementation*.
> I think range interface is one of the most useful concept for end
> users, but not good one for people who want to implement efficient
> libraries.

OK, I think we agree.  I am concerned about writing good library types  
that can efficiently use I/O.  The range interface will be for people who  
use the library and want to utilize existing range primitives for whatever  
purpose.

>
>>> I think IO primitives must be distinct from range ones for the reasons
>>> mentioned above...
>>
>>
>> Yes, I agree.  But ranges can be *mapped* to stream primitives.
>
> No, we cannot map output range concept to non-blocking output. 'put'
> operation always requires blocking.

Yes, but again, put can use whatever stream primitives we have.

In other words, it's quite possible to write range primitives which
utilize stream primitives.  It's impossible to write good stream
primitives which utilize range primitives.
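
A minimal sketch of that layering, not the actual library code: the names
ArraySource and ChunkRange and the read(ubyte[]) signature are assumptions
made up for illustration.  The range is a range of ubyte[] chunks, popFront
refills the buffer through the stream primitive, and front exposes what was
read.

```d
import std.algorithm.comparison : min;
import std.range.primitives : isInputRange;
import std.string : representation;

/// Hypothetical blocking stream: an in-memory array standing in for a
/// file or socket.
struct ArraySource
{
    const(ubyte)[] data;

    /// Stream primitive: copy up to buf.length bytes into buf and return
    /// the number of bytes copied; 0 means end of stream.
    size_t read(ubyte[] buf)
    {
        immutable n = min(buf.length, data.length);
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

/// Input range of ubyte[] chunks layered on any type that offers a
/// read(ubyte[]) primitive.
struct ChunkRange(Source)
{
    private Source* source;
    private ubyte[] buffer;
    private ubyte[] chunk;

    this(Source* source, size_t bufferSize)
    {
        this.source = source;
        buffer = new ubyte[](bufferSize);
        popFront(); // prime the first chunk so empty and front are valid
    }

    /// True once a read has returned no data.
    @property bool empty() const { return chunk.length == 0; }

    /// The current chunk; it aliases the buffer and is overwritten by
    /// the next popFront, so no copy is made.
    @property ubyte[] front() { return chunk; }

    /// Read the next block from the stream into the buffer.
    void popFront() { chunk = buffer[0 .. source.read(buffer)]; }
}

static assert(isInputRange!(ChunkRange!ArraySource));

unittest
{
    auto src = ArraySource("hello world".representation);
    size_t total;
    foreach (chunk; ChunkRange!ArraySource(&src, 4))
        total += chunk.length;
    assert(total == 11);
}
```

Note that the range only ever calls read, so it inherits whatever blocking
behavior the source has; nothing in the stream type depends on the range.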

>
>>> I'm designing experimental IO primitives:
>>> https://github.com/9rnsr/dio
>>
>>
>> I'll take a look.
>
> Thanks.

I'm having trouble following the code.  Is there a place with the
generated docs?  I'm looking for an overview to understand where to look.

Your lib is quite extensive, mine is only one file ;)

>
>>>
>>> In other words, range is not almighty. We should think distinct
>>> primitives for the IO.
>>
>>
>> 100% agree.  The main thing I realized that brought me to propose the
>> "range-based" (if you can call it that) version is that:
>>
>> 1. Ranges can be readily mapped to stream primitives *if* you use the
>> concept of a range of T[] vs. a range of T.  So in essence, without
>> changing anything I can slap on a range interface for free.
>> 2. Arrays make very efficient data sources, and are easy to create.  We
>> need a way to hook stream-using code onto an array.
>>
>> But be clear, I am *not* going to remove the existing stream I/O
>> primitives I had for buffered i/o, I'm rather *adding* range primitives
>> as well.
>
> My policy is very similar. But, as described above, I think range
> cannot cover non-blocking IO.
> And I think non-blocking IO interface is important for library
> implementations.

I think you misunderstand.  I'm not trying to make ranges be the base of
i/o, I'm trying to expose a range interface *based on* the stream i/o
interface.
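
The same direction works on the output side.  A rough sketch, with
ArraySink, StreamOutput and the write(const(ubyte)[]) signature all being
hypothetical names chosen for illustration: put is written in terms of a
stream write primitive that may accept only part of the data, and it loops
until everything is accepted, which is why put is inherently blocking.

```d
import std.algorithm.comparison : min;
import std.range.primitives : isOutputRange;
import std.string : representation;

/// Hypothetical stream sink that accepts at most `limit` bytes per call,
/// imitating the partial writes a socket or pipe can produce.
struct ArraySink
{
    ubyte[] written;
    size_t limit = 4;

    /// Stream primitive: consume a prefix of data and return its length.
    size_t write(const(ubyte)[] data)
    {
        immutable n = min(limit, data.length);
        written ~= data[0 .. n];
        return n;
    }
}

/// Output range layered on any type that offers a write(const(ubyte)[])
/// primitive.
struct StreamOutput(Sink)
{
    private Sink* sink;

    /// Returns only once the whole chunk has been accepted, so it blocks
    /// for as long as the underlying write does.  A sink that keeps
    /// returning 0 would stall this loop.
    void put(const(ubyte)[] data)
    {
        while (data.length)
            data = data[sink.write(data) .. $];
    }
}

static assert(isOutputRange!(StreamOutput!ArraySink, const(ubyte)[]));

unittest
{
    ArraySink sink;
    auto output = StreamOutput!ArraySink(&sink);
    output.put("hello world".representation);
    assert(sink.written == "hello world".representation);
}
```

Code that needs non-blocking output would call write directly and handle
the short count itself instead of going through put.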

-Steve