Parallel ByLine, ByChunk?

dsimcha dsimcha at yahoo.com
Tue Aug 9 11:58:22 PDT 2011


It's sometimes useful to read a file by line or by chunk such that the next
element is fetched in a background thread while the current one is being used.
std.parallelism.asyncBuf almost fits the bill here, except that, since the
buffers in std.stdio.File.ByLine and ByChunk are recycled, each element needs
to be duplicated before asyncBuf can hold on to it.  This is inefficient:
every line or chunk costs an extra allocation and copy.

Is there any interest in a ParallelByChunk and ParallelByLine in
std.stdio.File?  These would hold a user-specified number, nBuffers, of
buffers, which would be recycled.  A background thread would read ahead
until all buffers are full, then wait on a condition variable.  Once
popFront() has been called enough times that a large number of buffers are
free again, the background thread would be woken to refill them.  This would
all happen in parallel with the client using each chunk/line obtained by
front(), and would be fully encapsulated and safe, even though the
implementation would probably involve some low-level concurrency (i.e. using
core.thread and core.sync instead of std.concurrency or std.parallelism).
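
To make the proposal concrete, here is a rough, language-neutral sketch of
the buffer-recycling scheme, written in Python rather than D purely for
illustration.  It is not a proposed Phobos implementation.  The names
ParallelByChunk, n_buffers, wake_threshold, pop_front and the helper
methods are all hypothetical, and error handling (for example an I/O
exception in the background thread) is left out.

```python
import threading
from collections import deque


class ParallelByChunk:
    """Range-like reader: a background thread fills a fixed pool of recycled buffers."""

    def __init__(self, fileobj, chunk_size, n_buffers=4, wake_threshold=1):
        self._file = fileobj
        self._free = [bytearray(chunk_size) for _ in range(n_buffers)]
        self._filled = deque()      # (buffer, valid_length) pairs, oldest first
        self._cond = threading.Condition()
        self._eof = False
        self._current = None        # pair currently exposed through front
        # Hypothetical knob: wake the reader only once this many buffers are free.
        self._wake_threshold = max(1, min(wake_threshold, n_buffers))
        threading.Thread(target=self._read_ahead, daemon=True).start()
        self._fetch_next()          # prime front, as a D range constructor would

    def _read_ahead(self):
        while True:
            with self._cond:
                while not self._free:
                    self._cond.wait()       # every buffer is full or in use
                buf = self._free.pop()
            n = self._file.readinto(buf)    # blocking I/O happens outside the lock
            with self._cond:
                if n:
                    self._filled.append((buf, n))
                else:
                    self._eof = True
                self._cond.notify_all()     # the consumer may be waiting for data
            if not n:
                return

    def _fetch_next(self):
        with self._cond:
            while not self._filled and not self._eof:
                self._cond.wait()
            self._current = self._filled.popleft() if self._filled else None

    @property
    def empty(self):
        return self._current is None

    @property
    def front(self):
        # Like byChunk, the returned view is only valid until pop_front().
        buf, n = self._current
        return memoryview(buf)[:n]

    def pop_front(self):
        with self._cond:
            self._free.append(self._current[0])     # recycle the used buffer
            if len(self._free) >= self._wake_threshold:
                self._cond.notify_all()             # batch wake-up of the reader
        self._fetch_next()
```

A client would loop with "while not r.empty", use r.front, then call
r.pop_front().  No per-element copy is made unless the client chooses to
keep the data past pop_front().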

If there is interest, does anyone have any suggestions for making this
buffer-recycling range more general, rather than an ad-hoc solution that
requires rewriting byChunk and byLine and using low-level concurrency?  I'd
like to encapsulate this pattern and put it in std.parallelism somehow, but I
can't think of an easy way.

