Why GNU coreutils/dd is creating a dummy file more efficiently than D's For loop?
Cym13
cpicard at purrfect.fr
Thu May 23 09:44:15 UTC 2019
On Thursday, 23 May 2019 at 09:09:05 UTC, BoQsc wrote:
> This code of D creates a dummy 47,6 MB text file filled with
> Nul characters in about 9 seconds
>
> import std.stdio, std.process;
>
> void main() {
>
>     writeln("Creating a dummy file");
>     File file = File("test.txt", "w");
>
>     for (int i = 0; i < 50000000; i++)
>     {
>         file.write("\x00");
>     }
>     file.close();
>
> }
>
>
> While GNU coreutils dd can create 500mb dummy Nul file in a
> second.
> https://github.com/coreutils/coreutils/blob/master/src/dd.c
>
> What are the explanations for this?
If you're talking about benchmarking, it's important to provide
both the source code and how you compile and run it. However, in
this case I think I can already point you in the right direction.
I'll suppose that you used something like this:
dd if=/dev/zero of=testfile bs=1M count=500
Note in particular the block size argument. I set it to 1M, but by
default it's 512 bytes. If you run the command above under strace
you'll see a series of write() calls, each writing 1M of null
bytes to testfile. That's the main difference between your code
and what dd does: it doesn't write 1 byte at a time. This results
in far fewer system calls, and system calls are very expensive.
To go fast, read/write bigger chunks.
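As a rough illustration of the "bigger chunks" idea, here is a
minimal sketch in D that writes the same 50,000,000 null bytes
through a 1 MiB buffer instead of one byte per call. The helper
name writeZeros and the 1 MiB default are my own choices for the
example, not something taken from dd or from your code, and I
haven't benchmarked it against your version:

```d
import std.algorithm : min;
import std.stdio : File, writeln;

/// Hypothetical helper: writes `total` zero bytes to `path`,
/// at most `chunkSize` bytes per write call.
void writeZeros(string path, size_t total, size_t chunkSize = 1024 * 1024)
{
    assert(chunkSize > 0, "chunkSize must be positive");

    // A freshly allocated ubyte[] is zero-initialised in D,
    // so this buffer already holds the null bytes we want.
    auto chunk = new ubyte[](chunkSize);

    auto file = File(path, "wb");
    scope (exit) file.close();

    for (size_t remaining = total; remaining > 0;)
    {
        // The last chunk may be shorter than the buffer.
        const n = min(remaining, chunk.length);
        file.rawWrite(chunk[0 .. n]);
        remaining -= n;
    }
}

void main()
{
    writeln("Creating a dummy file");
    writeZeros("test.txt", 50_000_000);
}
```

With these numbers the loop body runs 48 times rather than
50,000,000 times. Try a few values of chunkSize and time them to
see how much it matters on your machine.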
I may be wrong though: maybe you tested with a bs of 1 byte. So
test for yourself and, if necessary, provide all the information
and not just pieces, so that we are able to reproduce your test :)