in not working for arrays is silly, change my view

Steven Schveighoffer schveiguy at
Tue Mar 3 00:51:34 UTC 2020

On 3/2/20 7:32 PM, aliak wrote:
> On Monday, 2 March 2020 at 23:27:22 UTC, Steven Schveighoffer wrote:
>> What I think is happening is that it determines nobody is using the 
>> result, and the function is pure, so it doesn't bother calling that 
>> function (probably not even the lambda, and then probably removes the 
>> loop completely).
>> I'm assuming for some reason, the binary search is not flagged pure, 
>> so it's not being skipped.
> Apparently you're right: 

That's not definitive. Note that a template member or member of a struct 
template can be *inferred* to be pure.

It's also entirely possible for the function to be pure, but the 
compiler decides for another reason not to elide the whole thing. 
Optimization isn't ever guaranteed.
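
To illustrate the inference point, here is a minimal sketch (the names `twice`, `Box`, and `check` are made up for this example and are not from the benchmark being discussed). Nothing in the templates is marked `pure`, yet an explicitly `pure` function can call them, which only compiles if the compiler inferred purity:

```d
// Free function template: `pure` is not written anywhere.
T twice(T)(T x) { return x * 2; }

// Member of a struct template: also subject to attribute inference.
struct Box(T)
{
    T value;
    T doubled() const { return value * 2; }
}

// A function explicitly marked pure may only call pure functions, so this
// compiles only because the two calls below were inferred to be pure.
int check(int x) pure
{
    return twice(x) + Box!int(x).doubled();
}

void main()
{
    assert(check(10) == 40); // 20 from twice, 20 from doubled
}
```

So the absence of a `pure` keyword in the source of a templated range type does not, on its own, say whether its members are treated as pure.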

>> If I change to this to ensure side effects:
>> bool makeImpure; // TLS variable outside of main
>> ...
>>     auto results = benchmark!(
>>         () => makeImpure = r1.canFind(max),
>>         () => makeImpure = r2.contains(max),
>>         () => makeImpure = r3.canFind(max),
>>     )(5_000);
>> writefln("%(%s\n%)", results); // modified to help with the comma confusion
>> I now get:
>> 4 secs, 428 ms, and 3 hnsecs
>> 221 μs and 9 hnsecs
>> 4 secs, 49 ms, 982 μs, and 5 hnsecs
>> More like what I expected!
> Ahhhh damn! And here I was thinking that branch prediction made a HUGE 
> difference! Ok, I'm taking my tail and slowly moving away now :) Let us 
> never speak of this again.

LOL, I'm sure this will come up again ;) The forums are full of 
confusing benchmarks where LDC has elided the whole thing being tested. 
It's amazing at optimizing. Sometimes, too amazing.

On 3/2/20 6:46 PM, H. S. Teoh wrote:
 > To prevent the optimizer from eliding "useless" code, you need to do
 > something with the return value that isn't trivial (assigning to a
 > variable that doesn't get used afterwards is "trivial", so that's not
 > enough). The easiest way is to print the result: the optimizer cannot
 > elide I/O.

Yeah, well, that means you are also benchmarking the I/O (which would 
dwarf the other pieces being tested).

I think assigning the result to a global fits the bill pretty well, but 
obviously only works when you're not inside a pure function.
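
A self-contained sketch of the global-sink approach, in the spirit of the snippet quoted above. The data (`haystack`, `sorted`), the sizes, and the name `sink` are assumptions for illustration and not the original poster's setup:

```d
import std.algorithm.searching : canFind;
import std.array : array;
import std.datetime.stopwatch : benchmark;
import std.range : assumeSorted, iota;
import std.stdio : writefln;

// Module-level (thread-local) variable. Writing to it is a side effect,
// so the lambdas below cannot be treated as pure.
bool sink;

void main()
{
    // Hypothetical test data: a sorted int array with the needle at the end.
    enum max = 100_000;
    int[] haystack = iota(0, max + 1).array;
    auto sorted = haystack.assumeSorted;

    auto results = benchmark!(
        () => sink = haystack.canFind(max), // linear scan
        () => sink = sorted.contains(max),  // binary search
    )(1_000);

    // One Duration per line, one line per benchmarked lambda.
    writefln("%(%s\n%)", results);
    assert(sink); // the needle is present, so the last write was true
}
```

This is not a guarantee that nothing gets optimized away; it only gives each call an observable effect, which is usually enough to keep the search itself in the loop.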

