Improving IO Speed

Fri Mar 14 15:55:53 PDT 2014

On Friday, 14 March 2014 at 19:11:12 UTC, Craig Dillabaugh wrote:
> On Friday, 14 March 2014 at 18:00:58 UTC, TJB wrote:
>> I have a program in C++ that I am translating to D as a way to 
>> investigate and learn D. The program is used to process 
>> potentially hundreds of TB's of financial transactions data so 
>> it is crucial that it be performant. Right now the C++ version 
>> is orders of magnitude faster.
>>
>> Here is a simple example of what I am doing in D:
>>
>> import std.stdio : writefln;
>> import std.stream;
>>
>> align(1) struct TaqIdx
>> {
>>  align(1) char[10] symbol;
>>  align(1) int tdate;
>>  align(1) int begrec;
>>  align(1) int endrec;
>> }
>>
>> void main()
>> {
>>  auto input = new File("T201212A.IDX");
>>  TaqIdx tmp;
>>  int count;
>>
>>  while(!input.eof())
>>  {
>>    input.readExact(&tmp, TaqIdx.sizeof);
>>    // Do something with the data
>>  }
>> }
>>
>> Do you have any suggestions for improving the speed in this 
>> situation?
>>
>> Thank you!
>>
>> TJB
>
> I am not sure how std.stream buffers data (the library has been 
> marked for removal, so perhaps not very efficiently), but what 
> happens if you read in a large array of your TaqIdx structs 
> with each read?

Well, one thing that I found out by experimentation was that if I 
replace

auto input = new File("T201212A.IDX");

with

auto input = new BufferedFile("T201212A.IDX");

the performance gap vanishes.  Now the D and C++ versions have 
nearly identical execution times.  But if std.stream is scheduled 
for removal, perhaps I shouldn't be using it?
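
For reference, here is a minimal sketch of what a std.stdio-based 
version might look like, reading a block of records per call as Craig 
suggested.  It assumes File.rawRead from std.stdio; the countRecords 
helper and the 4096-record block size are placeholders of my own, not 
something I have benchmarked.

```d
import std.stdio : File, writeln;

// Same packed 22-byte layout as the TaqIdx struct quoted above.
align(1) struct TaqIdx
{
align(1):
    char[10] symbol;
    int tdate;
    int begrec;
    int endrec;
}

static assert(TaqIdx.sizeof == 22);

// Hypothetical helper: reads the file in blocks of `blockSize` records
// and returns the number of complete records seen.
size_t countRecords(string path, size_t blockSize = 4096)
{
    auto input = File(path, "rb");
    auto buffer = new TaqIdx[](blockSize);
    size_t count;

    while (true)
    {
        // rawRead fills as much of the buffer as it can and returns the
        // slice that was actually read; the slice is empty at end of file.
        auto records = input.rawRead(buffer);
        if (records.length == 0)
            break;

        foreach (ref rec; records)
        {
            // Do something with the data
            ++count;
        }
    }
    return count;
}

void main()
{
    writeln(countRecords("T201212A.IDX"), " records read");
}
```

Reading many structs per call keeps the number of library calls low 
regardless of how the underlying file is buffered, which is presumably 
why BufferedFile closed the gap.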