On Wednesday, 18 February 2015 at 20:05:58 UTC, Jonathan Marler wrote:
>
> If I turn on optimization they both take 7 milliseconds.

You cannot benchmark it like this. To make it more realistic you should use multiple compilation units, add fences and cache invalidation.
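
For illustration only, here is a rough sketch of what such a harness could look like, written in C++ because the thread does not show the original benchmark code. Everything in it is an assumption: `work`, `evict_cache` and `time_once` are hypothetical names, the summing loop is a stand-in for whatever is actually being measured, and the 64-byte stride is a guess at the cache line size. In a real setup `work` would sit in its own compilation unit (built without link-time optimization) so the optimizer cannot inline or constant-fold it; the `noinline` attribute only approximates that within a single file.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical workload standing in for the code under test.
// Ideally this lives in a separate compilation unit; noinline is a
// single-file approximation for GCC and Clang.
#if defined(__GNUC__) || defined(__clang__)
__attribute__((noinline))
#endif
std::uint64_t work(const std::vector<std::uint32_t>& data) {
    std::uint64_t sum = 0;
    for (std::uint32_t v : data) {
        sum += v;
    }
    return sum;
}

// Crude cache "invalidation": touch one byte per assumed 64-byte line of a
// scratch buffer. This only evicts the benchmark data if the buffer is
// larger than the last-level cache, which the caller has to ensure.
void evict_cache(std::vector<unsigned char>& scratch) {
    for (std::size_t i = 0; i < scratch.size(); i += 64) {
        scratch[i]++;
    }
}

// Time a single call to work(). The fences ask the compiler and CPU not to
// move memory operations across the timing points; they reduce reordering
// but are not a guarantee that the measurement is exact.
std::chrono::nanoseconds time_once(const std::vector<std::uint32_t>& data,
                                   std::vector<unsigned char>& scratch,
                                   std::uint64_t& result_out) {
    evict_cache(scratch);

    std::atomic_thread_fence(std::memory_order_seq_cst);
    const auto start = std::chrono::steady_clock::now();

    const std::uint64_t result = work(data);

    std::atomic_thread_fence(std::memory_order_seq_cst);
    const auto stop = std::chrono::steady_clock::now();

    // Write the result through a volatile so the call cannot be discarded
    // as dead code, then hand it back to the caller.
    volatile std::uint64_t sink = result;
    result_out = sink;

    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}
```

A caller would allocate the scratch buffer once (for example 64 MiB, again an assumption about cache size), call `time_once` many times for each variant, and compare the distributions rather than a single reading.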