What are the best available D (not C) File input/output options?

confuzzled con at fuzzled.com
Mon Nov 6 01:26:15 UTC 2023


Good morning,

First, thanks to you, Steve, and Julian for responding to my inquiry.

On 11/3/23 4:59 AM, Sergey wrote:
> On Thursday, 2 November 2023 at 15:46:23 UTC, confuzzled wrote:
>> I've ported a small script from C to D. The original C version takes 
>> roughly 6.5 minutes to parse a 12G file while the port originally took 
>> about 48 minutes.
> 
> In my experience I/O in D is quite slow.
> But you can try to improve it:
> 
> Try to use std.outbuffer instead of writeln. And flush the result only 
> in the end.

Unless I did it incorrectly, this did nothing for me. My understanding 
is that I should first prepare an OutBuffer to which I write all my 
output. Once complete, I then write the OutBuffer to file, which still 
requires the use of writeln, albeit not as often.

First I tried buffering the entire thing, but that turned out to be a 
big mistake. Next I tried writing and clearing the buffer every 100_000 
records (about 3000 writeln calls).

Not as bad as the first attempt but significantly worse than what I 
obtained with the fopen/fprintf combo. I even tried writing the buffer 
to disk with fprintf but jumped ship because it took far longer than 
fopen/fprintf. Can't say how much longer because I terminated execution 
at 14 minutes.
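
For reference, a minimal sketch of the flush-every-N-records pattern
described above. It is not the code from the port: writeRecords, the
record format, the file name, and the 1 MiB reserve are all made up for
illustration. It also assumes the buffer is handed to File.rawWrite
rather than writeln, so the bytes go out in one bulk call with no
further formatting.

```d
import std.outbuffer : OutBuffer;
import std.stdio : File;

// Hypothetical helper: formats `count` placeholder records into an
// OutBuffer and flushes it to `path` every `flushEvery` records.
void writeRecords(string path, size_t count, size_t flushEvery)
{
    auto output = File(path, "wb");
    auto buf = new OutBuffer();
    buf.reserve(1 << 20); // assumption: ~1 MiB up front to limit regrowth

    foreach (i; 0 .. count)
    {
        buf.writefln("%d,%d", i, i * 2); // placeholder record format
        if ((i + 1) % flushEvery == 0)
        {
            output.rawWrite(buf.toBytes()); // one bulk write per batch
            buf.clear();                    // reuse the same memory
        }
    }

    // Flush whatever is left after the last full batch.
    if (buf.toBytes().length > 0)
        output.rawWrite(buf.toBytes());
}

void main()
{
    writeRecords("records.csv", 250_000, 100_000);
}
```

Whether this narrows the gap to fopen/fprintf on a 12G input is not
something the sketch can show; it only illustrates the shape of the
approach.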

> Also check this article. It is showing how manual buffers in D could 
> speed up the processing of files significantly: 
> https://tech.nextroll.com/blog/data/2014/11/17/d-is-for-data-science.html
> 
> 

The link above was quite helpful. Thanks. I am a bit slow on the uptake 
so it took a while to figure out how to apply the idea to my own use 
case. However, once I figured it out, the result was 2 minutes faster 
than the original C implementation and 3 minutes faster than the 
fopen/fprintf port.
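
A rough sketch of what a manual output buffer can look like in D. This
is not the article's code and not the code from my port; ChunkedWriter,
copyLines, and the 64 KiB capacity are hypothetical names and values
chosen for illustration. The idea is to copy output into a preallocated
array and call rawWrite only when the array fills up, while byLine
reuses its own buffer on the reading side.

```d
import std.stdio : File;

// Hypothetical fixed-size output buffer, flushed with rawWrite when full.
struct ChunkedWriter
{
    File file;
    char[] buf;
    size_t used;

    @disable this(this); // copying would duplicate the fill position

    this(string path, size_t capacity)
    {
        file = File(path, "wb");
        buf = new char[](capacity);
    }

    void put(const(char)[] s)
    {
        if (s.length > buf.length - used)
            flush(); // not enough room left: empty the buffer first
        if (s.length > buf.length)
        {
            file.rawWrite(s); // larger than the whole buffer: write directly
            return;
        }
        buf[used .. used + s.length] = s[];
        used += s.length;
    }

    void flush()
    {
        if (used > 0)
        {
            file.rawWrite(buf[0 .. used]);
            used = 0;
        }
    }
}

// Copies a file line by line; real parsing would replace the plain copy.
void copyLines(string inPath, string outPath)
{
    auto w = ChunkedWriter(outPath, 1 << 16);
    foreach (line; File(inPath, "rb").byLine)
    {
        w.put(line); // byLine strips the terminator by default
        w.put("\n");
    }
    w.flush(); // write out the final partial buffer
}

void main(string[] args)
{
    if (args.length == 3)
        copyLines(args[1], args[2]);
}
```

The explicit flush at the end matters: without it the last partial
buffer never reaches the file.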

Whether it did anything for the writeln implementation or not, I don't 
know. I wasn't willing to wait 45+ minutes for something that can 
feasibly be done in 6 minutes, so I gave up at 12 minutes.

I haven't tried std.string.representation yet, as Julian suggested, 
but I plan to.
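
For anyone unfamiliar with it, a small sketch of what
std.string.representation does. Julian's message is not quoted here, so
how it was meant to be applied to the parser is an assumption;
countFields is a hypothetical helper used only to show the call.

```d
import std.algorithm.searching : count;
import std.string : representation;

// Hypothetical helper: number of delimiter-separated fields in a line.
size_t countFields(string line, char delim = ',')
{
    // representation reinterprets the string as immutable(ubyte)[] without
    // copying, so range algorithms walk raw bytes instead of decoding UTF-8.
    immutable(ubyte)[] bytes = line.representation;
    return bytes.count(cast(ubyte) delim) + 1;
}

void main()
{
    assert(countFields("a,b,c") == 3);
}
```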


Thanks again.
--Confuzzled


More information about the Digitalmars-d-learn mailing list