buffered input

Sat Feb 5 13:44:25 PST 2011

"Andrei Alexandrescu" <SeeWebsiteForEmail at erdani.org> wrote in message 
news:iijq99$1a5o$1 at digitalmars.com...
> On 2/5/11 6:46 AM, spir wrote:
>> On 02/05/2011 10:36 AM, Nick Sabalausky wrote:
>>> On a separate note, I think a good way of testing the design (and end up
>>> getting something useful anyway) would be to try to use it to create a
>>> range that's automatically-buffered in a more traditional way. Ie, Given
>>> any input range 'myRange', "buffered(myRange, 2048)" (or something like
>>> that) would wrap it in a new input range that automatically buffers the
>>> next 2048 elements (asynchronously?) whenever its internal buffer is
>>> exhausted. Or something like that. It's late and I'm tired and I can't
>>> think anymore ;)
>>
>> That's exactly what I'm expecting.
>> Funnily enough, I was about to start a thread on the topic after reading
>> related posts. My point was:
>> "I'm not a specialist in efficiency (rather the opposite), I just know
>> there is --theoretically-- relevant performance loss to expect from
>> unbuffered input in various cases. Could we define a generic
>> input-buffering primitive allowing people to benefit from others'
>> competence? Just like Appender."
>>
>> Denis
>
> The buffered range interface as I defined it supports infinite lookahead. 
> The interface mentioned by Nick has lookahead between 1 and 2048. So I 
> don't think my interface is appropriate for that.
>
> Infinite lookahead is a wonderful thing. Consider reading lines from a 
> file. Essentially what you need to do is to keep on reading blocks of data 
> until you see \n (possibly followed by some more stuff). Then you offer 
> the client the line up to the \n. When the client wants a new line, you 
> combine the leftover data you already have with new stuff you read. On 
> occasion you need to move over leftovers, but if your internal buffers are 
> large enough that is negligible (I actually tested this recently).
>
> Another example: consider dealing with line continuations in reading CSV 
> files. Under certain conditions, you need to read one more line and stitch 
> it with the existing one. This is easy with infinite lookahead, but quite 
> elaborate with lookahead 1.
>
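
For reference, here is roughly how I picture the line-reading scheme quoted 
above. This is only a minimal sketch under my own assumptions: "LineReader", 
"readBlock", "pending" and "next" are hypothetical names, not part of the 
proposed buffered range interface or of Phobos, and the block source is 
assumed to return an empty block at end of input.

```d
import std.string : indexOf;

// Hypothetical helper (not Phobos API): yields lines from a source that
// hands out arbitrary-sized blocks and an empty block at end of input.
struct LineReader
{
    const(char)[] delegate() readBlock; // assumed block source
    char[] pending;                     // data read but not yet handed out

    // Returns false at end of input; otherwise stores the next line
    // (without its '\n') in `line`. The line aliases the internal buffer.
    bool next(out char[] line)
    {
        ptrdiff_t nl;
        // Keep reading blocks until the pending data contains a newline.
        while ((nl = pending.indexOf('\n')) < 0)
        {
            auto block = readBlock();
            if (block.length == 0)
            {
                // End of input: whatever is left is the final line.
                if (pending.length == 0) return false;
                line = pending;
                pending = null;
                return true;
            }
            pending ~= block; // combine leftovers with the new data
        }
        line = pending[0 .. nl];
        pending = pending[nl + 1 .. $]; // leftovers stay for the next call
        return true;
    }
}

void main()
{
    auto blocks = ["ab", "c\nde", "f\n", "gh"];
    size_t i;
    const(char)[] nextBlock() { return i < blocks.length ? blocks[i++] : null; }

    auto reader = LineReader(&nextBlock);
    char[] line;
    string[] lines;
    while (reader.next(line))
        lines ~= line.idup; // copy, since `line` points into the buffer
    assert(lines == ["abc", "def", "gh"]);
}
```

The sketch rescans the pending data on every pass and lets the array append 
do the "moving over leftovers", so it shows the idea rather than an 
efficient implementation.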

I think I can see how it might be worthwhile to discourage the traditional 
buffer interface I described in favor of the above. It wouldn't be as 
trivial to use as what people are used to, but I can see that it could avoid 
a lot of unnecessary copying, especially with other people's suggestion of 
allowing the user to provide their own buffer to be filled (and it seems 
easy enough to learn).

But what about when you want a circular buffer? I.e., when you know a certain 
maximum lookahead is fine and you want to minimize memory usage and 
buffer-appends. Circular buffers don't do infinite lookahead, so the 
interface maybe doesn't work as well. Plus you probably wouldn't want to 
provide an interface for slicing into the buffer, since the slice could 
straddle the wrap-around point, which would require a new allocation (i.e., 
"return buffer[indexOfFront+sliceStart..$] ~ 
buffer[0..sliceLength-($-(indexOfFront+sliceStart))]"). I guess maybe that 
would just call for another type of range.
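
To make the wrap-around problem concrete, here is a rough sketch of what 
slicing a circular buffer might look like. "Ring", "slice" and "count" are 
names I made up for illustration; nothing here is an existing or proposed 
Phobos type. A contiguous request can be returned as a view into the 
storage, while a request that straddles the end has to be copied into a 
fresh array.

```d
// Hypothetical fixed-capacity ring buffer; names are illustrative only.
struct Ring(T)
{
    T[] buffer;          // storage; its length is the maximum lookahead
    size_t indexOfFront; // position of the oldest element
    size_t count;        // number of elements currently stored

    // Returns elements [sliceStart, sliceStart + sliceLength), counted from
    // the front. Contiguous requests are views into `buffer`; requests that
    // straddle the wrap-around point are copied into a new array.
    T[] slice(size_t sliceStart, size_t sliceLength)
    {
        assert(sliceStart + sliceLength <= count);
        immutable begin = (indexOfFront + sliceStart) % buffer.length;
        immutable tail = buffer.length - begin; // elements before the wrap
        if (sliceLength <= tail)
            return buffer[begin .. begin + sliceLength];
        return buffer[begin .. $] ~ buffer[0 .. sliceLength - tail];
    }
}

void main()
{
    // Storage holds 4 5 1 2 3, so the logical order is 1 2 3 4 5.
    auto r = Ring!int([4, 5, 1, 2, 3], 2, 5);
    assert(r.slice(0, 3) == [1, 2, 3]);            // contiguous
    assert(r.slice(0, 3).ptr == r.buffer.ptr + 2); // a view, no copy
    assert(r.slice(1, 4) == [2, 3, 4, 5]);         // straddles the end
    assert(r.slice(1, 4).ptr != r.buffer.ptr + 3); // freshly allocated
}
```

So the caller can't tell from the interface whether a slice is cheap or 
costs an allocation, which is the part that makes me think it wants a 
different kind of range.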