Taming the optimizer
Mike Franklin
slavo5150 at yahoo.com
Thu Jun 14 03:39:39 UTC 2018
I'm trying to run benchmarks on my memcpy implementation
(https://forum.dlang.org/post/trenuawrekkbewjudmsy@forum.dlang.org) using LDC with optimizations enabled (e.g. `ldc2 -O3 memcpyd.d`). In my first attempt, the optimizer stripped out most of the code I was trying to measure.
Using the information at
https://stackoverflow.com/questions/40122141/preventing-compiler-optimizations-while-benchmarking, I've created this:
void use(void* p)
{
    version(LDC)
    {
        import ldc.llvmasm;
        __asm("", "r", p);
    }
}

void clobber()
{
    version(LDC)
    {
        import ldc.llvmasm;
        __asm("", "~{memory}");
    }
}
// `f` is the function I wish to benchmark. It's an
// implementation of memcpy in D.
Duration benchmark(T, alias f)(const T* src, T* dst)
{
    import core.time : Duration;
    import std.datetime.stopwatch : AutoStart, StopWatch;

    enum iterations = 10_000_000;
    Duration result;

    auto sw = StopWatch(AutoStart.yes);
    sw.reset();
    foreach (_; 0 .. iterations)
    {
        f(src, dst);
        use(dst);
        clobber();
    }
    result = sw.peek();

    return result;
}
This seems to work, but I'm not sure I've implemented it
properly, especially the `use` function. How would you write
this to achieve a realistic optimized measurement? What's the
equivalent of...
static void escape(void* p)
{
    asm volatile("" : : "g"(p) : "memory");
}
... in LDC inline assembly?
Thanks,
Mike
More information about the digitalmars-d-ldc mailing list