in not working for arrays is silly, change my view
Steven Schveighoffer
schveiguy at gmail.com
Tue Mar 3 00:51:34 UTC 2020
On 3/2/20 7:32 PM, aliak wrote:
> On Monday, 2 March 2020 at 23:27:22 UTC, Steven Schveighoffer wrote:
>>
>> What I think is happening is that it determines nobody is using the
>> result, and the function is pure, so it doesn't bother calling that
>> function (probably not even the lambda, and then probably removes the
>> loop completely).
>>
>> I'm assuming for some reason, the binary search is not flagged pure,
>> so it's not being skipped.
>
> Apparently you're right:
> https://github.com/dlang/phobos/blob/5e13653a6eb55c1188396ae064717a1a03fd7483/std/range/package.d#L11107
That's not definitive. Note that a template member or member of a struct
template can be *inferred* to be pure.
It's also entirely possible for the function to be pure, but the
compiler decides for another reason not to elide the whole thing.
Optimization isn't ever guaranteed.
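A minimal sketch of that inference, assuming a hypothetical struct template `Wrapper` and a helper `check` (neither comes from Phobos or from this thread). `has` carries no `pure` annotation, yet it can be called from a `pure` function because the compiler infers attributes for members of templated aggregates:

```d
// Hypothetical struct template, used only to illustrate attribute inference.
struct Wrapper(T)
{
    T[] data;

    // No explicit `pure` here. Because Wrapper is a template, the compiler
    // infers attributes such as `pure` from the function body.
    bool has(T value)
    {
        foreach (x; data)
            if (x == value)
                return true;
        return false;
    }
}

// This compiles only if Wrapper!int.has was inferred to be pure.
bool check(Wrapper!int w, int value) pure
{
    return w.has(value);
}

void main()
{
    auto w = Wrapper!int([1, 2, 3]);
    assert(check(w, 2));
    assert(!check(w, 7));
}
```

If `Wrapper` were an ordinary (non-template) struct, `has` would need an explicit `pure` for `check` to compile, which is why the absence of the keyword in a templated source file says little about the function's actual attributes.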
>
>
>>
>> If I change to this to ensure side effects:
>>
>> bool makeImpure; // TLS variable outside of main
>>
>> ...
>>
>> auto results = benchmark!(
>> () => makeImpure = r1.canFind(max),
>> () => makeImpure = r2.contains(max),
>> () => makeImpure = r3.canFind(max),
>> )(5_000);
>>
>> writefln("%(%s\n%)", results); // modified to help with the comma confusion
>>
>> I now get:
>> 4 secs, 428 ms, and 3 hnsecs
>> 221 μs and 9 hnsecs
>> 4 secs, 49 ms, 982 μs, and 5 hnsecs
>>
>> More like what I expected!
>
> Ahhhh damn! And here I was thinking that branch prediction made a HUGE
> difference! Ok, I'm taking my tail and slowly moving away now :) Let us
> never speak of this again.
LOL, I'm sure this will come up again ;) The forums are full of
confusing benchmarks where LDC has elided the whole thing being tested.
It's amazing at optimizing. Sometimes, too amazing.
On 3/2/20 6:46 PM, H. S. Teoh wrote:
> To prevent the optimizer from eliding "useless" code, you need to do
> something with the return value that isn't trivial (assigning to a
> variable that doesn't get used afterwards is "trivial", so that's not
> enough). The easiest way is to print the result: the optimizer cannot
> elide I/O.
Yeah, well, that means you are also benchmarking the I/O (which would
dwarf the other pieces being tested).
I think assigning the result to a global fits the bill pretty well, but
obviously only works when you're not inside a pure function.
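As a self-contained sketch of the global-assignment approach: the names `sink` and `run`, the array size, and the iteration count below are all assumptions chosen for illustration, not the setup used for the timings quoted earlier.

```d
import std.algorithm.searching : canFind;
import std.array : array;
import std.datetime.stopwatch : benchmark;
import std.range : assumeSorted, iota;
import std.stdio : writefln;

// Module-level (thread-local) variable. Writing to it is a side effect,
// so the lambdas below are not pure and their results count as used.
bool sink;

// Hypothetical helper: times a linear search against a binary search.
auto run(uint iterations)
{
    enum needle = 100_000;              // arbitrary size, an assumption
    auto arr = iota(needle + 1).array;  // int[] holding 0 .. needle
    auto sorted = arr.assumeSorted;     // SortedRange, provides contains()

    return benchmark!(
        () => sink = arr.canFind(needle),     // linear search
        () => sink = sorted.contains(needle), // binary search
    )(iterations);
}

void main()
{
    auto results = run(1_000);
    writefln("%(%s\n%)", results); // one duration per line
}
```

The store to `sink` happens inside the timed lambdas, but it is a single assignment per call, so it adds far less overhead than printing each result would.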
-Steve