A few measurements of stat()'s speed
Jonathan Marler
johnnymarler at gmail.com
Tue Mar 26 23:28:00 UTC 2019
On Tuesday, 26 March 2019 at 20:09:52 UTC, Andrei Alexandrescu wrote:
> On 3/26/19 2:36 PM, H. S. Teoh wrote:
>> Does caching the contents of import directories cause significant
>> overhead? If not, why not just cache it anyway, regardless of whether
>> the import happens across network mounts.
>
> Because testing takes 10 minutes and implementation takes one
> day or more. We want to make sure there's impact.
>
>> On a slightly different note, why are we paying so much attention to
>> import speeds anyway?
>
> You destroy your own opening point: work should be put where
> there's potential for impact, not "regardless".
>
>> We can optimize import speeds to hell and back again until they cost
>> practically zero time, yet the act of actually *using* one of those
>> imports -- ostensibly the reason you'd want to import anything in the
>> first place -- immediately adds a huge amount of overhead that by far
>> overshadows those niggly microseconds we pinched. Ergo:
>>
>> import std.regex;
>> void main() {
>>     version(withRegex)
>>         auto re = regex("a");
>> }
>>
>> This takes about 0.5 seconds to compile without -version=withRegex on
>> my machine. With -version=withRegex, it takes about *4.5 seconds* to
>> compile. We have a 4 second bottleneck here and yet we're trying to
>> shave off microseconds elsewhere. Why does instantiating a
>> single-character regex add FOUR SECONDS to compilation time? I think
>> *that* is the question we should be answering.
>
> There's a matter of difficulty. I don't have a good attack on
> dramatically improving regexen. If you do, it would be of
> course a high impact project. There's also a matter of paying
> for what you don't use. Unused imports are, well, unused. Used
> imports should be paid for in proportion. Agreed, 4.5 seconds
> is not quite proportionate.
I've included a script below that generates and runs a performance
test. Save it to your box as "gen", run "./gen" to generate the test,
then run "./build" to execute it.
I tried changing the "stat" calls to use "access" instead, but with
around 70,000 system calls (counted with strace), it made no
noticeable difference: around 2.2 seconds with "stat" and about the
same with "access". So the issue is not how much data stat returns;
it's the overhead of performing any system call at all.
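To get a rough feel for that per-call overhead outside the compiler, here is a minimal Python sketch (not from the original post; the path and iteration count are arbitrary choices) that times repeated stat and access calls on the same path:

```python
import os
import time

def time_calls(fn, path, n=100_000):
    # Time n invocations of fn(path) and return elapsed seconds.
    start = time.perf_counter()
    for _ in range(n):
        fn(path)
    return time.perf_counter() - start

path = "."  # any existing path works
stat_secs = time_calls(os.stat, path)
access_secs = time_calls(lambda p: os.access(p, os.R_OK), path)
print("stat:   {:.3f}s for 100000 calls".format(stat_secs))
print("access: {:.3f}s for 100000 calls".format(access_secs))
```

On Linux both calls cost roughly the same kernel round trip, which is consistent with the observation above that swapping stat for access changes little.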
#!/usr/bin/env python3
# Run "./gen" to generate the files for the performance test
# Run "./build" to run the test
import os
import stat

mod_count = 1000
path_count = 20

def mkdir(dir):
    if not os.path.exists(dir):
        os.mkdir(dir)

mkdir("out")
for i in range(0, path_count):
    mkdir("out/lib{}".format(i))
mkdir("out/mods")
for i in range(0, mod_count):
    with open("out/mods/mod{}.d".format(i), "w") as file:
        for j in range(0, mod_count):
            file.write("import mod{};\n".format(j))
with open("out/main.d", "w") as file:
    for i in range(0, mod_count):
        file.write("import mod{};\n".format(i))
    file.write('void main() { import std.stdio; writeln("working"); }')
with open("build", "w") as file:
    file.write("#!/usr/bin/env bash\n")
    file.write('[ "$DMD" != "" ] || DMD=dmd\n')
    file.write("set -x\n")
    file.write("time $DMD \\\n")
    for i in range(0, path_count):
        file.write("    -I=out/lib{} \\\n".format(i))
    file.write("    -I=out/mods out/main.d $@")
os.chmod("build", stat.S_IRWXU | stat.S_IRWXG | stat.S_IROTH | stat.S_IXOTH)