D is for Data Science
Dmitry Olshansky via Digitalmars-d-announce
digitalmars-d-announce at puremagic.com
Mon Nov 24 15:36:37 PST 2014
25-Nov-2014 01:28, bearophile wrote:
> Dmitry Olshansky:
>
>>> Why is File.byLine so slow?
>>
>> Seems to be mostly fixed sometime ago.
>
> Really? I am not so sure.
>
> Bye,
> bearophile
I too suspected it in the past, and then I tested it.
Now I have tested it again; it's always easier to check than to argue.
Two minimal programs:
//my.d:
import std.stdio;

void main(string[] args) {
    auto file = File(args[1], "r");
    size_t cnt = 0;
    // byLine reuses its internal buffer, so each line is a char[] slice
    foreach (char[] line; file.byLine()) {
        cnt++;
    }
}

//my2.d:
import core.stdc.stdio;
import std.string : toStringz;

void main(string[] args) {
    char[] buf = new char[32768];
    size_t cnt;
    // toStringz guarantees the NUL terminator that fopen expects
    auto file = fopen(args[1].toStringz, "r");
    while (fgets(buf.ptr, cast(int)buf.length, file) != null) {
        cnt++;
    }
    fclose(file);
}
In the console session below, the log file is my dmesg log replicated many
times (34 MB total).
dmitry@Ubu64 ~ $ wc -l log
522240 log
dmitry@Ubu64 ~ $ du -hs log
34M log
# touch it, to have it in disk cache:
dmitry@Ubu64 ~ $ cat log > /dev/null
dmitry@Ubu64 ~ $ dmd my
dmitry@Ubu64 ~ $ dmd my2
dmitry@Ubu64 ~ $ time ./my2 log
real 0m0.062s
user 0m0.039s
sys 0m0.023s
dmitry@Ubu64 ~ $ time ./my log
real 0m0.181s
user 0m0.155s
sys 0m0.025s
About 4 times slower in user mode (0.155s vs. 0.039s), okay...
Now with full optimizations, since ranges are very sensitive to optimization:
dmitry@Ubu64 ~ $ dmd -O -release -inline my
dmitry@Ubu64 ~ $ dmd -O -release -inline my2
dmitry@Ubu64 ~ $ time ./my2 log
real 0m0.065s
user 0m0.042s
sys 0m0.023s
dmitry@Ubu64 ~ $ time ./my2 log
real 0m0.063s
user 0m0.040s
sys 0m0.023s
Which is 1:1 parity. Another myth busted? ;)
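For anyone who wants to repeat the check, here is a minimal sketch of an in-process variant that also prints the line count, so it can be compared against `wc -l`. The helper name `countLines` is hypothetical and not part of the programs above, and `std.datetime.stopwatch` is assumed to be available, which requires a newer Phobos than the 2014 release used for these timings.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : File, writefln;

// Hypothetical helper: count lines with File.byLine.
// byLine reuses one buffer between iterations, so no per-line allocation happens here.
size_t countLines(string path)
{
    size_t cnt = 0;
    foreach (line; File(path, "r").byLine())
        cnt++;
    return cnt;
}

void main(string[] args)
{
    if (args.length < 2)
    {
        writefln("usage: %s <file>", args[0]);
        return;
    }
    auto sw = StopWatch(AutoStart.yes);
    immutable cnt = countLines(args[1]);
    sw.stop();
    // Printing cnt lets the result be checked against `wc -l`.
    writefln("%s lines in %s ms", cnt, sw.peek.total!"msecs");
}
```

This measures only the read loop, so process startup is excluded; the numbers will not match `time` exactly.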
--
Dmitry Olshansky