std.csv Performance Review

Jesse Phillips via Digitalmars-d digitalmars-d at puremagic.com
Sun Jun 4 08:59:03 PDT 2017


On Sunday, 4 June 2017 at 06:15:24 UTC, H. S. Teoh wrote:
> On Sun, Jun 04, 2017 at 05:41:10AM +0000, Jesse Phillips via 
> Digitalmars-d wrote:
>> On Saturday, 3 June 2017 at 23:18:26 UTC, bachmeier wrote:
>> > Do you know what happened with fastcsv [0], original thread 
>> > [1].
>> > 
>> > [0] https://github.com/quickfur/fastcsv
>> > [1] 
>> > http://forum.dlang.org/post/mailman.3952.1453600915.22025.digitalmars-d-learn@puremagic.com
>> 
>> I do not. Rereading that in light of this new article, I'm a 
>> little sceptical of the 51 times faster claim, since I'm seeing 
>> only 10x against these other implementations.
> [...]
>
> You don't have to be skeptical, neither do you have to believe 
> what I claimed.  I posted the entire code I used in the 
> original thread, as well as the URLs of the exact data files I 
> used for testing.  You can just run it yourself and see the 
> results for yourself.

OK, I took you up on that, and I'm still skeptical:

LDC2 -O3 -release -enable-cross-module-inlining

std.csv: 12487 msecs
fastcsv (no gc): 1376 msecs
csvslicing: 3039 msecs

That looks like about 10 times faster to me. The slicing 
version failed on \r\n line endings (I guess multi-part 
separators are broken), so I changed the data file in order to 
get an execution time.
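
For reference, the ratios implied by those timings can be worked out directly. This is only arithmetic on the three numbers above, written in Python for brevity; the dictionary and variable names are mine and purely illustrative.

```python
# Timings in milliseconds, copied from the run above.
timings_ms = {
    "std.csv": 12487,
    "fastcsv (no gc)": 1376,
    "csvslicing": 3039,
}

baseline = timings_ms["std.csv"]

# Speedup of each parser relative to std.csv (baseline time / parser time).
speedups = {name: baseline / ms for name, ms in timings_ms.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```

On this run fastcsv comes out at a little over 9x and csvslicing at a little over 4x relative to std.csv.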

Anyway, I'm not trying to claim fastcsv isn't good at what it 
does. All I'm trying to point out is that std.csv is doing more 
work than these faster CSV parsers. And I don't even want to claim 
that std.csv is better because of that work; it actually appears 
that doing validation was a mistake.
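
To make "more work" concrete, here is a rough sketch of the difference between splitting a line and validating it. It is not the std.csv or fastcsv implementation, and it is in Python rather than D just to keep it short; both function names are hypothetical. The point is only that a validating parser has to inspect every character and track quoting state, while a trusting one can slice on the separator.

```python
def split_fields(line):
    """Non-validating: split on commas and trust the input."""
    return line.split(",")


def parse_fields_validated(line):
    """Validating: walk every character and reject malformed quoting."""
    fields, current = [], []
    in_quotes = False
    at_field_start = True
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if in_quotes:
            if ch != '"':
                current.append(ch)
            elif i + 1 < n and line[i + 1] == '"':
                current.append('"')  # "" inside quotes is an escaped quote
                i += 1
            else:
                in_quotes = False
                # A closing quote must be followed by a comma or end of line.
                if i + 1 < n and line[i + 1] != ",":
                    raise ValueError("data after closing quote")
        elif ch == ",":
            fields.append("".join(current))
            current = []
            at_field_start = True
        elif ch == '"':
            if not at_field_start:
                raise ValueError("quote inside unquoted field")
            in_quotes = True
            at_field_start = False
        else:
            current.append(ch)
            at_field_start = False
        i += 1
    if in_quotes:
        raise ValueError("unterminated quoted field")
    fields.append("".join(current))
    return fields


line = 'a,"b,c",d'
print(split_fields(line))            # splits inside the quoted field
print(parse_fields_validated(line))  # keeps "b,c" together
```

The validating version also raises on input such as `a,b"c`, which the plain split would pass through silently.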

