Speed of csvReader

Thu Jan 21 14:13:38 PST 2016

On Thursday, 21 January 2016 at 21:24:49 UTC, H. S. Teoh wrote:
> [snip]
> There are some limitations to this approach: while the current 
> code does try to unwrap quoted values in the CSV, it does not 
> correctly parse escaped double quotes ("") in the fields. This 
> is because to process those values correctly we'd have to copy 
> the field data into a new string and construct its interpreted 
> value, which is slow.  So I leave it as an exercise for the 
> reader to implement (it's not hard, when the double 
> double-quote sequence is detected, allocate a new string with 
> the interpreted data instead of slicing the original data. 
> Either that, or just unescape the quotes in the application 
> code itself).

What about wrapping the slices in a range-like interface that 
unescapes the quotes on demand? You could even set a flag on it 
during the initial pass to mark fields that contain doubled 
quotes needing to be unescaped, so that fields without them don't 
take a per-pop performance hit checking for them (that's probably 
a pretty minor boost, if any, though).
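
To make that concrete, here is a rough sketch of what I have in mind. `LazyField` and `hasEscapes` are names I made up for illustration; nothing here comes from the posted parser or from std.csv. It assumes the parser has already stripped the outer quotes and sets the flag when it sees a `""` pair inside the field. It works on UTF-8 code units, which should be safe because `"` is a single ASCII byte.

```d
import std.algorithm.comparison : equal;
import std.range.primitives : isForwardRange;

/// Hypothetical wrapper around a field slice. Yields one `"` for each
/// `""` pair in the underlying data, without allocating or copying.
struct LazyField
{
    private const(char)[] data; // slice of the original input, outer quotes removed
    private bool hasEscapes;    // assumed to be set by the parser's initial pass

    @property bool empty() const { return data.length == 0; }

    @property char front() const { return data[0]; }

    void popFront()
    {
        // Only flagged fields pay for the extra comparison on each pop.
        if (hasEscapes && data.length >= 2 && data[0] == '"' && data[1] == '"')
            data = data[2 .. $]; // skip both halves of the escaped quote
        else
            data = data[1 .. $];
    }

    @property LazyField save() { return this; }
}

static assert(isForwardRange!LazyField);

unittest
{
    // Flagged field: each doubled quote collapses to a single one.
    assert(equal(LazyField(`say ""hi""`, true), `say "hi"`));
    // Unflagged field: the data passes through unchanged.
    assert(equal(LazyField(`plain`, false), `plain`));
}
```

Application code that really needs a `string` could still copy the range into one, so the allocation only happens for the fields that ask for it.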