Speed of csvReader

H. S. Teoh via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Jan 21 16:50:12 PST 2016


On Thu, Jan 21, 2016 at 04:31:03PM -0800, H. S. Teoh via Digitalmars-d-learn wrote:
> On Thu, Jan 21, 2016 at 04:26:16PM -0800, H. S. Teoh via Digitalmars-d-learn wrote:
[...]
> > 	https://github.com/quickfur/fastcsv
> 
> Oh, forgot to mention, the parsing times are still lightning fast
> after the fixes I mentioned: still around 1190 msecs or so.
> 
> Now I'm tempted to actually implement doubled-quote interpretation...
> as long as the input file doesn't contain unreasonable amounts of
> doubled quotes, I'm expecting the speed should remain pretty fast.
[...]

Done, commits pushed to github.

The new code now parses doubled quotes correctly.  The performance is
slightly worse now, around 1300 msecs on average, even in files that
don't have any doubled quotes (it's a penalty incurred by the inner loop
needing to detect doubled quote sequences).
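
For anyone curious what "detecting doubled quote sequences" means in
practice, here is a minimal sketch. It is not the code in fastcsv; the
function name parseQuotedField and the slice-first, copy-only-when-needed
strategy are assumptions made for illustration. The point is only that every
quote character inside a quoted field needs a one-character lookahead to
decide whether it closes the field or is half of a "" pair.

```d
import std.array : appender;

/// Hypothetical helper, NOT taken from fastcsv. Parses one quoted field
/// that starts at input[0] == '"'. Returns the field contents and sets
/// `consumed` to the number of characters read (closing quote included).
string parseQuotedField(string input, out size_t consumed)
{
    assert(input.length > 0 && input[0] == '"', "field must start with a quote");

    // Fast path: scan ahead to the first quote character.
    size_t i = 1;
    while (i < input.length && input[i] != '"')
        i++;

    // That quote is not followed by another one (or the input ended), so
    // the field has no doubled quotes: return a slice, no copying needed.
    if (i + 1 >= input.length || input[i + 1] != '"')
    {
        consumed = (i < input.length) ? i + 1 : i;
        return input[1 .. i];
    }

    // Slow path: at least one "" pair, so the field has to be copied.
    auto buf = appender!string();
    buf.put(input[1 .. i]);
    while (i < input.length)
    {
        if (input[i] == '"')
        {
            if (i + 1 < input.length && input[i + 1] == '"')
            {
                buf.put('"'); // "" collapses to one literal quote
                i += 2;
                continue;
            }
            i++; // lone quote: end of field
            break;
        }
        buf.put(input[i]);
        i++;
    }
    consumed = i;
    return buf.data;
}

unittest
{
    size_t n;
    assert(parseQuotedField(`"abc",x`, n) == "abc");
    assert(parseQuotedField(`"a""b"`, n) == `a"b`);
}
```

Compiling with `dmd -unittest -main` runs the checks. In this sketch a field
without doubled quotes costs one extra comparison at its closing quote, and
only fields that contain "" pay for a copy.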

However, my benchmark input file doesn't contain any doubled quotes (code
correctness with doubled quotes is checked by unittests only), so the
performance numbers may not accurately reflect true performance in the
general case. (But if doubled quotes are rare, as I'm expecting, the
actual performance shouldn't change too much in general usage...)

Maybe somebody who has a file with lots of ""'s can run the benchmark to
see how badly it performs? :-P
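
If nobody has such a file handy, one could be generated. The following is
only a sketch under assumptions: the names quoteField, makeRow and
writeStressFile, the field contents and the file layout are all made up for
illustration and are not part of fastcsv or its benchmark.

```d
import std.array : appender, replace;
import std.conv : to;
import std.stdio : File;

/// Quotes a field CSV-style: doubles embedded quotes, then wraps in quotes.
string quoteField(string s)
{
    return `"` ~ s.replace(`"`, `""`) ~ `"`;
}

/// Builds one record of `cols` fields, each containing embedded quotes.
string makeRow(size_t row, size_t cols)
{
    auto buf = appender!string();
    foreach (c; 0 .. cols)
    {
        if (c > 0)
            buf.put(',');
        buf.put(quoteField(`say "` ~ row.to!string ~ `" and "` ~ c.to!string ~ `"`));
    }
    return buf.data;
}

/// Writes `rows` such records to `path` (hypothetical output location).
void writeStressFile(string path, size_t rows, size_t cols)
{
    auto f = File(path, "w");
    foreach (r; 0 .. rows)
        f.writeln(makeRow(r, cols));
}

unittest
{
    assert(quoteField(`a"b`) == `"a""b"`);
}
```

Calling something like writeStressFile("stress.csv", 1_000_000, 10) would
give a file in which every field contains four doubled quotes, which should
be close to a worst case for the doubled-quote handling.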


T

-- 
Heuristics are bug-ridden by definition. If they didn't have bugs, they'd be algorithms.

