Read text file fast, how?
Andrei Alexandrescu via Digitalmars-d
digitalmars-d at puremagic.com
Sun Jul 26 08:36:37 PDT 2015
On 7/26/15 10:35 AM, Johan Holmberg via Digitalmars-d wrote:
>
> On Sat, Jul 25, 2015 at 10:12 PM, Andrei Alexandrescu via Digitalmars-d
> <digitalmars-d at puremagic.com> wrote:
>
> On 7/25/15 1:53 PM, Johan Holmberg via Digitalmars-d wrote:
>
> Thanks, my question seems like a carbon copy of the Stack Overflow
> article :) Somehow I had missed it when googling.
>
> I downloaded a dmd 2.068 beta and re-tried with my input file: now the D
> program takes 1.6s (a 10x improvement).
>
>
> Great, though it still seems to be behind the C++ version, which is
> a bummer. -- Andrei
>
>
> My C++ program was actually doing C-style IO via <stdio.h>. I didn't
> think about the distinction C/C++ when reporting the earlier numbers.
>
> If I switch to full C++ style: <fstream> + <string> + C++ version of
> getline(), then the C++-solution is even slower than Python: 5.2s. I
> think it is the C++ libraries of Clang on MacOS Yosemite that are slow.
>
> This prompted me to re-run the tests on a Linux machine (Ubuntu 14.04),
> still with the same input file, a text file with 7M lines and total size
> of 466MB:
>
> C++ with <stdio.h> style IO: 0.40s
> C++ with <fstream> style IO: 0.31s
> D 2.067: 1.75s
> D 2.068 beta 2: 0.69s
> Perl: 1.49s
> Python: 1.86s
>
> So on Ubuntu, the C++ <fstream> version was clearly best, and the
> improvement in DMD 2.068 beta was "only" a factor of 2.5 over 2.067.
>
> /johan
I think we should investigate this and bring performance to par. Anyone
interested? -- Andrei
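
For anyone picking this up, here is a minimal sketch of the two C++ I/O styles compared in the quoted numbers. The original benchmark programs are not shown in this thread, so this assumes the test simply counts lines; the function names count_lines_stdio and count_lines_fstream are hypothetical and the buffer size is arbitrary.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <fstream>
#include <string>

// C-style (<stdio.h>): fgets into a fixed buffer. A long line may arrive
// in several chunks, so a line is counted only when a chunk ends in '\n',
// plus once more at EOF if the last line has no terminator.
std::size_t count_lines_stdio(const char* path) {
    std::FILE* f = std::fopen(path, "r");
    if (!f) return 0;  // assumption: treat an unreadable file as empty
    char buf[4096];
    std::size_t lines = 0;
    bool in_line = false;  // true while inside a line with no '\n' seen yet
    while (std::fgets(buf, sizeof buf, f)) {
        std::size_t len = std::strlen(buf);
        if (len > 0 && buf[len - 1] == '\n') {
            ++lines;
            in_line = false;
        } else {
            in_line = true;
        }
    }
    if (in_line) ++lines;
    std::fclose(f);
    return lines;
}

// C++-style (<fstream> + <string>): std::getline into a std::string.
std::size_t count_lines_fstream(const char* path) {
    std::ifstream in(path);
    std::string line;
    std::size_t lines = 0;
    while (std::getline(in, line)) ++lines;
    return lines;
}
```

Both functions should report the same count for the same file, so timing each against the 466MB input would be one way to reproduce the comparison on a given platform.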