Read text file fast, how?

Andrei Alexandrescu via Digitalmars-d digitalmars-d at puremagic.com
Thu Jul 30 07:25:28 PDT 2015


On 7/27/15 8:03 AM, Johan Holmberg via Digitalmars-d wrote:
>
>
> On Sun, Jul 26, 2015 at 5:36 PM, Andrei Alexandrescu via Digitalmars-d wrote:
>
>     On 7/26/15 10:35 AM, Johan Holmberg via Digitalmars-d wrote:
>
>
>         On Sat, Jul 25, 2015 at 10:12 PM, Andrei Alexandrescu via
>         Digitalmars-d wrote:
>
>              On 7/25/15 1:53 PM, Johan Holmberg via Digitalmars-d wrote:
>         [...]
>                  I downloaded a dmd 2.068 beta and retried with my input
>                  file: now the D program takes 1.6s (a 10x improvement).
>
>              Great, though it still seems to be behind the C++ version,
>              which is a bummer. -- Andrei
>         [... linux numbers removed ...]
>
>
>     I think we should investigate this and bring performance to par.
>     Anyone interested? -- Andrei
>
>
>
> Back on MacOS again, I thought I should try to run "Instruments" on my
> program. I'm not familiar with the DMD source code, but I did the following:
>
> - downloaded the DMD source from Github + built it
> - rebuilt my program with this dmd
> - used Instruments (the MacOS profiler) on my program
>
> Two things showed up in Instruments that seemed suspicious, both in
> "stdio.d":
>
> 1) calls to "__tls_get_addr" inside "readlnImpl" (taking 0.25s out of the
> total 1.69s according to Instruments). I added "__gshared" to the static
> variables "lineptr" and "n" to see if it had any effect (see below for
> results).
>
> 2) calls to "std.algorithm.endsWith" inside File.ByLine.Impl.popFront
> (taking 0.10s according to Instruments). I replaced it with a simpler
> test using inline code.
>
> The timings running my program normally (not using Instruments now),
> became as follows with the different versions of dmd:
>
> dmd unmodified: 1.59s
> dmd with change 1): 1.33s
> dmd with change 1+2): 1.22s
> C++ using <stdio.h>: 1.13s    (for comparison)
>
> My changes to dmd are of course not correct, but my program still works
> as before at least. If 1) and 2) could be changed "the right way" the
> difference to the C++ program would be much smaller on MacOS (I haven't
> looked further into the Linux results).
>
> Does this help us move forward?
>
> /johan
>

Thanks, yes, this is a great start.

Would anyone want to refine these insights into a pull request?
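
For anyone picking this up, here is a minimal D sketch of the two ideas
in Johan's list. It is not the Phobos code: the variable names and the
helper endsWithTerminator are hypothetical and exist only for
illustration. It shows (1) that D variables are thread-local unless
marked __gshared, which is presumably where the "__tls_get_addr" calls
in the profile come from, and (2) a direct last-character test that
agrees with endsWith for a single-character terminator.

```d
import core.thread : Thread;
import std.algorithm.searching : endsWith;

// In D, module-level and static variables are thread-local by default,
// so each thread gets its own copy.
int perThreadValue;

// __gshared gives one instance per process, like a C global. It is not
// synchronized, so it is not a drop-in fix for library code.
__gshared int processWideValue;

// Hypothetical helper (not in Phobos): a direct last-character test
// in place of a generic endsWith call.
bool endsWithTerminator(const(char)[] line, char terminator)
{
    return line.length != 0 && line[$ - 1] == terminator;
}

void main()
{
    // Write both variables from a second thread.
    auto worker = new Thread({
        perThreadValue = 1;
        processWideValue = 1;
    });
    worker.start();
    worker.join();

    // The main thread has its own copy of the thread-local variable.
    assert(perThreadValue == 0);
    // The __gshared variable is the same storage in every thread.
    assert(processWideValue == 1);

    // The inline test agrees with endsWith for a one-character terminator.
    foreach (line; ["abc\n", "abc", "", "\n"])
        assert(endsWithTerminator(line, '\n') == line.endsWith('\n'));
}
```

The __gshared half shows why Johan calls his change "not correct": a
buffer shared by all threads would need synchronization or a different
design before it could go into the library.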


Andrei


