Read text file fast, how?

Johan Holmberg via Digitalmars-d digitalmars-d at puremagic.com
Mon Jul 27 05:03:28 PDT 2015


On Sun, Jul 26, 2015 at 5:36 PM, Andrei Alexandrescu via Digitalmars-d <
digitalmars-d at puremagic.com> wrote:

> On 7/26/15 10:35 AM, Johan Holmberg via Digitalmars-d wrote:
>
>>
>> On Sat, Jul 25, 2015 at 10:12 PM, Andrei Alexandrescu via Digitalmars-d
>> <digitalmars-d at puremagic.com <mailto:digitalmars-d at puremagic.com>> wrote:
>>
>>     On 7/25/15 1:53 PM, Johan Holmberg via Digitalmars-d wrote:
>> [...]
>>         I download a dmd 2.068 beta, and re-tried with my input file:
>>         now the D
>>         program takes 1.6s (a 10x improvement).
>>
>>     Great, though it still seems to be behind the C++ version, which is
>>     a bummer. -- Andrei
>> [... linux numbers removed ...]
>>
>
> I think we should investigate this and bring performance to par. Anyone
> interested? -- Andrei
>


Back on MacOS again, I thought I should try to run "Instruments" on my
program. I'm not familiar with the DMD source code, but I did the following:

- downloaded the DMD source from GitHub and built it
- rebuilt my program with this dmd
- used Instruments (the MacOS profiler) on my program

Two things showed up in Instruments that seemed suspicious, both in
"stdio.d":

1) calls to "__tls_get_addr" inside "readlnImpl" (taking 0.25s out of the
total 1.69s according to Instruments). I added "__gshared" to the static
variables "lineptr" and "n" to see if it had any effect (see below for
results).

2) calls to "std.algorithm.endsWith" inside File.ByLine.Impl.popFront
(taking 0.10s according to Instruments). I replaced it with a simpler test
using inline code.

Running my program normally (without Instruments), the timings with the
different versions of dmd were as follows:

dmd unmodified:             1.59s
dmd with change 1):         1.33s
dmd with changes 1) and 2): 1.22s
C++ using <stdio.h>:        1.13s    (for comparison)

My changes to dmd are of course not correct, but my program still works as
before at least. If 1) and 2) could be changed "the right way" the
difference to the C++ program would be much smaller on MacOS (I haven't
looked further into the Linux results).

Does this help in moving this forward?

/johan