buffered input
Don
nospam at nospam.com
Sat Feb 5 16:28:49 PST 2011
spir wrote:
> On 02/05/2011 10:44 PM, Nick Sabalausky wrote:
>> "Andrei Alexandrescu"<SeeWebsiteForEmail at erdani.org> wrote in message
>> news:iijq99$1a5o$1 at digitalmars.com...
>>> On 2/5/11 6:46 AM, spir wrote:
>>>> On 02/05/2011 10:36 AM, Nick Sabalausky wrote:
>>>>> On a separate note, I think a good way of testing the design (and
>>>>> ending up with something useful anyway) would be to try to use it to
>>>>> create a range that's automatically buffered in a more traditional
>>>>> way. I.e., given any input range 'myRange', "buffered(myRange, 2048)"
>>>>> (or something like that) would wrap it in a new input range that
>>>>> automatically buffers the next 2048 elements (asynchronously?)
>>>>> whenever its internal buffer is exhausted. Or something like that.
>>>>> It's late and I'm tired and I can't think anymore ;)
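
For the record, that wrapper could look something like this in D. Just a
sketch: only the name buffered() and the 2048 default come from the
paragraph above, and the async part is left out.

import std.range.primitives : isInputRange, ElementType;

// Wraps any input range in one that reads ahead `capacity` elements at
// a time; refilling happens lazily when the buffer is exhausted.
struct Buffered(R) if (isInputRange!R)
{
    private R source;
    private ElementType!R[] buf;
    private size_t pos;
    private size_t capacity;

    this(R source, size_t capacity)
    {
        this.source = source;
        this.capacity = capacity;
        refill();
    }

    private void refill()
    {
        buf.length = 0;
        pos = 0;
        while (buf.length < capacity && !source.empty)
        {
            buf ~= source.front;
            source.popFront();
        }
    }

    @property bool empty() { return pos == buf.length; }
    @property auto front() { return buf[pos]; }

    void popFront()
    {
        ++pos;
        if (pos == buf.length)
            refill();   // block exhausted, pull in the next one
    }
}

auto buffered(R)(R r, size_t n = 2048) { return Buffered!R(r, n); }

Usage would just be auto r = buffered(someInputRange); and then iterate
as usual.
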
>>>>
>>>> That's exactly what I'm expecting.
>>>> Funnily enough, I was about to start a thread on the topic after
>>>> reading related posts. My point was:
>>>> "I'm not a specialist in efficiency (rather the opposite); I just know
>>>> that there is, theoretically, a relevant performance loss to be
>>>> expected from unbuffered input in various cases. Could we define a
>>>> generic input-buffering primitive that lets people benefit from
>>>> others' competence? Just like Appender."
>>>>
>>>> Denis
>>>
>>> The buffered range interface as I defined it supports infinite
>>> lookahead. The interface mentioned by Nick has lookahead between 1 and
>>> 2048. So I don't think my interface is appropriate for that.
>>>
>>> Infinite lookahead is a wonderful thing. Consider reading lines from a
>>> file. Essentially what you need to do is to keep on reading blocks of
>>> data until you see \n (possibly followed by some more stuff). Then you
>>> offer the client the line up to the \n. When the client wants a new
>>> line, you combine the leftover data you already have with new stuff you
>>> read. On occasion you need to move over leftovers, but if your internal
>>> buffers are large enough that is negligible (I actually tested this
>>> recently).
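
Concretely, that loop might look like this (block size and names are
illustrative, error handling is omitted, and the returned slice is only
valid until the next call):

import std.stdio : File;
import std.algorithm.searching : countUntil;

// Accumulates fixed-size blocks from a File and hands out one line at
// a time; leftover bytes past the '\n' stay buffered for the next call.
struct LineReader
{
    File input;
    ubyte[] buf;    // unconsumed data
    size_t start;   // offset of the not-yet-returned part

    // Next line without its '\n', or null at end of input.
    const(char)[] nextLine()
    {
        for (;;)
        {
            auto nl = countUntil(buf[start .. $], '\n');
            if (nl >= 0)
            {
                auto line = cast(const(char)[]) buf[start .. start + nl];
                start += nl + 1;   // skip past the '\n'
                return line;
            }
            // No newline in the leftovers: move them to the front (the
            // occasional copy mentioned above) and read another block.
            foreach (i; 0 .. buf.length - start)
                buf[i] = buf[start + i];
            buf.length -= start;
            start = 0;
            auto old = buf.length;
            buf.length = old + 4096;
            auto got = input.rawRead(buf[old .. $]);
            buf = buf[0 .. old + got.length];
            if (got.length == 0)   // EOF: hand out whatever remains
            {
                if (buf.length == 0) return null;
                auto line = cast(const(char)[]) buf;
                buf = null;
                return line;
            }
        }
    }
}
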
>>>
>>> Another example: consider dealing with line continuations in reading
>>> CSV files. Under certain conditions, you need to read one more line and
>>> stitch it with the existing one. This is easy with infinite lookahead,
>>> but quite elaborate with lookahead 1.
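
With a nextLine() like the sketch above, the stitching is short. (The
odd-quote-count test below is one simplistic convention for detecting an
open quoted field, not the full CSV rule.)

import std.algorithm.searching : count;

// One logical CSV record: keep appending physical lines while a quoted
// field is still open, i.e. while we have seen an odd number of '"'.
string nextRecord(ref LineReader r)
{
    auto line = r.nextLine();
    if (line is null) return null;
    string record = line.idup;
    while (count(record, '"') % 2 == 1)
    {
        auto more = r.nextLine();
        if (more is null) break;   // unterminated field at EOF
        record ~= "\n";            // the newline was inside the field
        record ~= more;
    }
    return record;
}
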
>>>
>>
>> I think I can see how it might be worthwhile to discourage the
>> traditional buffer interface I described in favor of the above. It
>> wouldn't be as trivial to use as what people are used to, but I can see
>> that it could avoid a lot of unnecessary copying, especially with other
>> people's suggestion of allowing the user to provide their own buffer to
>> be filled (and it seems easy enough to learn).
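
The fill-my-own-buffer flavor mentioned above could be as small as this
(purely illustrative):

// The caller owns the storage; read() fills as much of it as it can
// and returns the slice it actually filled (empty at end of input).
interface BufferedSource(T)
{
    T[] read(T[] callerBuffer);
}
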
>>
>> But what about when you want a circular buffer? I.e., when you know a
>> certain maximum lookahead is fine and you want to minimize memory usage
>> and buffer appends. Circular buffers don't do infinite lookahead, so the
>> interface maybe doesn't work as well. Plus you probably wouldn't want to
>> provide an interface for slicing into the buffer, since the slice could
>> straddle the wrap-around point, which would require a new allocation
>> (i.e. "return buffer[frontIndex+sliceStart..$] ~
>> buffer[0..sliceLength-($-(frontIndex+sliceStart))]"). I guess maybe that
>> would just call for another type of range.
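
For what it's worth, the straddling case is the only ugly part; a sketch
(illustrative, no bounds checking):

// Fixed-capacity circular buffer; slicing allocates only when the
// requested window straddles the wrap-around point.
struct CircularBuffer(T)
{
    T[] data;           // backing store; data.length is the capacity
    size_t frontIndex;  // physical index of the logical first element

    // Logical elements [a, b), counted from the front.
    T[] slice(size_t a, size_t b)
    {
        auto start = (frontIndex + a) % data.length;
        auto end = start + (b - a);
        if (end <= data.length)
            return data[start .. end];   // contiguous, no copy needed
        // Straddles the end: stitch the two physical pieces together.
        return data[start .. $] ~ data[0 .. end - data.length];
    }
}
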
>
> Becomes too complicated, doesn't it?
>
> Denis
Circular buffers don't seem like an 'optional' use case to me. Most real
I/O works that way.
I think if the abstraction can't handle it, the abstraction is a failure.