Speed of csvReader

H. S. Teoh via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Fri Jan 22 23:01:52 PST 2016


On Fri, Jan 22, 2016 at 10:04:58PM +0000, data pulverizer via Digitalmars-d-learn wrote:
[...]
> >$ dmd file_read_5.d fastcsv.d
> >$ ./file_read_5
> >Time (s): 0.679
> >
> >Fastest so far, very nice.

Thanks!


> I guess the next step is allowing Tuple rows with mixed types.

I thought about that a little today. I'm guessing that most of the
performance cost will come from the conversion into the target types.
Right now it's extremely fast because, for the most part, it's just
taking slices of an existing string. It shouldn't be too hard to extend
the current code so that instead of assembling the string slices in a
block buffer, it runs them through std.conv.to and stores the results
in an array of some given struct. But there may be some performance
degradation, because now we have to do non-trivial work on each string
slice.
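
Roughly what I have in mind, as an untested sketch (Record and
toRecord are made-up names for illustration, not part of fastcsv's
actual API):

    import std.conv : to;

    struct Record
    {
        int id;
        double price;
        string name;
    }

    // Convert one row of field slices into a struct, one field per
    // member, using std.conv.to for the numeric fields.
    Record toRecord(const(char)[][] fields)
    {
        Record r;
        r.id    = fields[0].to!int;
        r.price = fields[1].to!double;
        r.name  = fields[2].idup; // GC copy; see the caveat below
        return r;
    }

    unittest
    {
        const(char)[][] fields = ["42", "3.14", "widget"];
        auto r = toRecord(fields);
        assert(r.id == 42 && r.name == "widget");
    }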

Converting from const(char)[] to string should probably be avoided
where it's not necessary, since otherwise it will involve lots and
lots of small allocations, and the GC will slow everything down.
Converting to ints may not be too bad... but conversion to
floating-point types may be quite slow. Now, assembling the resulting
structs into an array could potentially be slow... but perhaps an
analogous block buffer technique can be used: create the array
piecemeal in separate blocks, and only perform the final assembly into
a single array at the very end (thus avoiding reallocating and copying
the growing array as we go along). A sketch of that idea follows.
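
Something along these lines, perhaps (hypothetical sketch, untested;
BlockAppender is a made-up name, not actual fastcsv code):

    // Append items into fixed-size blocks so the growing array is
    // never reallocated, then join the blocks with a single
    // allocation at the very end.
    struct BlockAppender(T, size_t blockSize = 4096)
    {
        private T[][] blocks;  // completed (full) blocks
        private T[] current;   // block currently being filled
        private size_t used;   // slots used in `current`

        void put(T item)
        {
            if (used == current.length)
            {
                if (current.length) blocks ~= current;
                current = new T[blockSize];
                used = 0;
            }
            current[used++] = item;
        }

        // One final allocation + copy, instead of the repeated
        // reallocate-and-copy of naive appending.
        T[] data()
        {
            size_t total = used;
            foreach (b; blocks) total += b.length;
            auto result = new T[total];
            size_t pos;
            foreach (b; blocks)
            {
                result[pos .. pos + b.length] = b[];
                pos += b.length;
            }
            result[pos .. pos + used] = current[0 .. used];
            return result;
        }
    }

    // Usage (with the hypothetical toRecord from above):
    //     BlockAppender!Record app;
    //     foreach (row; rows) app.put(toRecord(row));
    //     Record[] records = app.data();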

But we'll see.  Performance predictions are rarely accurate; only a
profiler will tell the truth about where the real bottlenecks are. :-)
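
For the record, dmd has a built-in instrumenting profiler, so a first
pass could be as simple as:

    $ dmd -profile file_read_5.d fastcsv.d
    $ ./file_read_5    # writes per-function timings to trace.log

(Though for hot loops, a sampling profiler like perf on an optimized
build may be more representative, since -profile's instrumentation
adds overhead of its own.)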


T

-- 
LINUX = Lousy Interface for Nefarious Unix Xenophobes.

