CSV - 2.5 GB/sec (sadly C++)

Laeeth Isharc via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Sat Oct 29 21:56:24 PDT 2016


On Tuesday, 26 January 2016 at 22:36:31 UTC, H. S. Teoh wrote:
> So the moral of the story is: avoid large numbers of small 
> allocations. If you have to do it, consider consolidating your 
> allocations into a series of allocations of large(ish) buffers 
> instead, and taking slices of the buffers.

Thanks for sharing this, H. S. Teoh.  I tried replacing the 
individual allocations with a Region from 
std.experimental.allocator (with FreeList and Quantizer on top), 
and then deallocating everything in one go once I'm done with 
the data.  It seems to be a little faster, but I haven't had 
time to measure it properly.
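For anyone curious, the approach looks roughly like the sketch 
below.  This is a minimal, hypothetical example of the basic 
pattern (a Region arena backed by Mallocator, freed all at once), 
not my actual code, and it omits the FreeList and Quantizer 
layers:

```d
import std.experimental.allocator : makeArray;
import std.experimental.allocator.building_blocks.region : Region;
import std.experimental.allocator.mallocator : Mallocator;

void main()
{
    // One large arena; small allocations become cheap
    // bump-pointer slices of it.  64 MB is an arbitrary size.
    auto region = Region!Mallocator(64 * 1024 * 1024);

    // Many small allocations, no per-allocation bookkeeping.
    auto buf = region.makeArray!char(1024);
    // ... parse into buf, take slices of it, etc. ...

    // No individual frees: the whole arena is returned to
    // Mallocator when `region` goes out of scope.
}
```

The win is that thousands of small allocations collapse into one 
big one, and teardown is a single deallocation rather than a free 
per object.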

Just came across this C++ project, which seems to have 
astonishing performance: 7 minutes to read a terabyte, and 
2.5 to 4.5 GB/sec reading a file cold.  That's pretty 
impressive.  (Obviously they read in parallel, but I haven't yet 
read the source to see what the other tricks might be.)

It would be nice to be able to match that in D, though 
practically speaking it's probably easiest just to wrap it:

http://www.wise.io/tech/paratext

https://github.com/wiseio/paratext
