Speed of csvReader

data pulverizer via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Jan 21 07:17:08 PST 2016


On Thursday, 21 January 2016 at 14:56:13 UTC, Saurabh Das wrote:
> On Thursday, 21 January 2016 at 14:32:52 UTC, Saurabh Das wrote:
>> On Thursday, 21 January 2016 at 13:42:11 UTC, Edwin van 
>> Leeuwen wrote:
>>> On Thursday, 21 January 2016 at 09:39:30 UTC, data pulverizer 
>>> wrote:
>>>>   StopWatch sw;
>>>>   sw.start();
>>>>   auto buffer = std.file.readText("Acquisition_2009Q2.txt");
>>>>   auto records = csvReader!row_type(buffer, '|').array;
>>>>   sw.stop();
>>>
>>>
>>> Is it csvReader or readText that is slow? i.e. could you move 
>>> sw.start(); one line down (after the readText command) and 
>>> see how long just the csvReader part takes?
>>
>> Please try this:
>>
>> auto records = 
>> File("Acquisition_2009Q2.txt").byLine.joiner("\n").csvReader!row_type('|').array;
>>
>> Can you put up some sample data and share the number of 
>> records in the file as well.
>
> Actually since you're aiming for speed, this might be better:
>
> sw.start();
> auto records = 
> File("Acquisition_2009Q2.txt").byChunk(1024*1024).joiner.map!(a 
> => cast(dchar)a).csvReader!row_type('|').array;
> sw.stop();
>
> Please do verify that the end result is the same - I'm not 100% 
> confident of the cast.
>
> Thanks,
> Saurabh

@Saurabh I have tried your latest suggestion, and the time drops 
only slightly, to:

Time (s): 6.345

The previous suggestion (byLine with joiner) actually increased the 
time.

@Edwin van Leeuwen csvReader is what takes most of the time; 
readText takes only 0.229 s.
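
For anyone who wants to reproduce the split between the two stages, here 
is a minimal sketch of timing readText and csvReader separately. It is 
not the code I ran: the Row type, the sample file name and the generated 
data are hypothetical stand-ins (row_type and the real file are not shown 
in this thread), and it assumes a recent Phobos in which StopWatch lives 
in std.datetime.stopwatch.

```d
import std.array : appender, array;
import std.csv : csvReader;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.file : readText, remove, write;
import std.format : formattedWrite;
import std.stdio : writefln;
import std.typecons : Tuple;

// Hypothetical three-column row; stands in for the row_type used above.
alias Row = Tuple!(int, string, double);

void main()
{
    // Hypothetical sample file with '|' separated fields, generated here
    // so the sketch does not depend on Acquisition_2009Q2.txt.
    enum path = "csv_timing_sample.txt";
    enum rowCount = 100_000;

    auto data = appender!string;
    foreach (i; 0 .. rowCount)
    {
        if (i > 0)
            data.put('\n');
        data.formattedWrite("%d|name%d|%s", i, i, i * 0.5);
    }
    write(path, data.data);
    scope (exit) remove(path);

    // Stage 1: read the whole file into memory.
    auto sw = StopWatch(AutoStart.yes);
    auto buffer = readText(path);
    auto readTime = sw.peek;

    // Stage 2: parse the in-memory text. reset() zeroes the elapsed time
    // and leaves the stopwatch running.
    sw.reset();
    auto records = csvReader!Row(buffer, '|').array;
    auto parseTime = sw.peek;

    assert(records.length == rowCount);

    writefln("readText  (ms): %s", readTime.total!"msecs");
    writefln("csvReader (ms): %s", parseTime.total!"msecs");
}
```

The absolute numbers will differ from the ones above, since the sample 
data is much smaller and simpler than the real file; the point is only to 
show where each measurement starts and stops.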

