iopipe v0.0.4 - RingBuffers!
Steven Schveighoffer
schveiguy at yahoo.com
Fri May 11 14:57:14 UTC 2018
On 5/11/18 10:04 AM, Uknown wrote:
> On Friday, 11 May 2018 at 13:28:58 UTC, Steven Schveighoffer wrote:
>> What I have learned here is:
>>
>> 1. Ring buffers are really cool (I still love how it works) and
>> perform as well as normal buffers
>> 2. The use cases are much smaller than I thought
>> 3. In most real-world applications, they are a wash, and not worth the
>> OS tricks needed to use it.
>
> Now I need to learn all about ring-buffers. Do you have any good
> starting points?
I would start here: https://en.wikipedia.org/wiki/Circular_buffer
The point of a circular buffer is to avoid having to copy any data
anywhere -- things stay in place as long as they are in-buffer.
So originally, I had intended to make a ring buffer (or circular buffer)
where I have a random access range between 2 separate segments. In other
words, the range would abstract the fact that the buffer was 2 segments,
and give you random access to each element by checking to see which
segment it's in.
I never actually made it, because I realized quickly while optimizing
e.g. line processing that the huge benefit of having sequential memory
far outweighs the drawback of occasionally having to copy data. In other
words, you are paying for every element access to avoid paying for a
rare small copy.
Consider a byline range. If you have a 8k buffer, and your lines are
approximately 80 bytes in length average, when you reach the end of the
buffer and have to move whatever existing partial-line to the front of
the buffer to continue reading, you are really only copying 1% of the
buffer, 1% of the time. But while you are searching for line endings
(99% of the time), you are using a simple indexed pointer dereference.
Contrast that with a disjoint buffer where every access to an element
first requires a check to see which segment you are in before
dereferencing. You have moved the payment from the 1% into the 99%.
BUT, when at dconf, Dmitry and Shachar let me know about a technique to
map the same memory segment to 2 consecutive address ranges. This allows
you to look at the ring buffer without it ever being disjoint. Simply
put, you have a 2x buffer, whereby each half looks at the same memory.
Whenever your buffer start gets to the half way point, you simply move
the pointers back by half a buffer. Other than that, the code is nearly
identical to a straight allocated buffer, and the memory access is just
as fast.
So I decided to implement, hoping that I would magically just get a bit
better performance. I should have known better :)
>> I'm thinking it's even simpler than that. All matches are dead on a
>> line break (it's how grep normally works), so you simply have to parse
>> the lines and run each one via regex. What I don't know is how much it
>> costs regex to startup and run on an individual line.
>>
>> One thing I could do to amortize is keep 2N lines in the buffer, and
>> run the regex on a whole context's worth of lines, then dump them all.
>
> iopipe is looking like a great library!
Thanks! I hope to get more utility out of it. I still need to
finish/publish my json parser based on it, and I'm thinking we need some
parsing tools really to go on top of it to make things easier to approach.
> That reminds me of this great blog post detailing grep's performance:
> http://ridiculousfish.com/blog/posts/old-age-and-treachery.html
>
> Also, one of the original authors of grep wrote about its performance
> optimizations, for anyone interested:
> https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html
Thanks, I'll take a look at those.
-Steve
More information about the Digitalmars-d-announce
mailing list