Time to move std.experimental.checkedint to std.checkedint ?

Wed Mar 31 05:49:25 UTC 2021

On 3/31/21 1:16 AM, Vladimir Panteleev wrote:
> On Wednesday, 31 March 2021 at 04:08:02 UTC, Andrei Alexandrescu wrote:
>> Not much to write home about. The jumps scale linearly with the number 
>> of primitive operations:
>>
>> https://godbolt.org/z/r3sj1T4hc
> 
> Right, but as we both know, speed doesn't necessarily scale with the 
> number of instructions for many decades now.

Of course, and I wasn't suggesting the contrary. If speed simply 
increased as instructions retired decreased, inliners would be much more 
aggressive etc. etc. But such statements need to be carefully qualified, 
which is why I do my best not to make them in isolation. The 
qualification here would be... "except in most of the cases, where it 
does". Instructions retired is generally a telling proxy.

> Curiosity got the better of me and I played with this for a bit.
> 
> Here is my program:
> 
> https://dump.cy.md/d7b7ae5c2d15c8c0127fd96dd74909a1/main.zig
> 
> Two interesting observations:
> 
> 1. The compiler (whether it's the Zig frontend or the LLVM backend) is 
> smart about adding the checks. If it can prove that the values will 
> never overflow, then the overflow checks aren't emitted. I had to trick 
> it into thinking that they may overflow, when in practice they never will.
> 
> 1b. The compiler is actually that aware of the checks, that in one of my 
> attempts to get it to always emit them, it actually generated a version 
> of the function with and without the checks, and called the unchecked 
> version in the case where it knew that it will never overflow! Amazing!
> 
> 2. After finally getting it to always generate the checks, and 
> benchmarking the results, the difference in run time I'm seeing between 
> ReleaseFast and ReleaseSafe is a measly 2.7%. The disassembly looks all 
> right too: https://godbolt.org/z/3nY7Ee4ff
> 
> Personally, 2.7% is a price I'm willing to pay any day, if it helps save 
> me from embarrassments like https://github.com/CyberShadow/btdu/issues/1 :)

That's in line with expectations for a small benchmark. On larger 
applications the impact of bigger code on the instruction cache would be 
more detrimental. (Also, the branch predictor is a limited resource, so 
more jumps mean decreased predictability of others; I'm not sure how that 
compares in magnitude with the impact on the instruction cache, which is 
a larger and more common problem.)
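
For concreteness, here is a minimal sketch in C of the two flavors being 
compared. It is not the Zig program linked above, and it assumes GCC or 
Clang, since it relies on their `__builtin_add_overflow` builtin. The 
function names are made up for illustration. The checked loop carries one 
extra conditional branch per addition, which is the kind of code growth 
discussed above; how much that costs in a given program is something 
only measurement can tell.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sums xs[0..n) with wraparound on overflow.
   __builtin_add_overflow always stores the truncated (wrapped) result,
   so ignoring its return value gives wrapping arithmetic without
   relying on undefined signed overflow. */
int32_t sum_wrapping(const int32_t *xs, size_t n) {
    int32_t acc = 0;
    for (size_t i = 0; i < n; ++i) {
        (void)__builtin_add_overflow(acc, xs[i], &acc);
    }
    return acc;
}

/* Hypothetical helper: same sum, but reports overflow.
   Returns true and writes the total to *out on success;
   returns false (leaving *out untouched) if any addition overflowed.
   The `if` is the additional branch per primitive operation. */
bool sum_checked(const int32_t *xs, size_t n, int32_t *out) {
    int32_t acc = 0;
    for (size_t i = 0; i < n; ++i) {
        if (__builtin_add_overflow(acc, xs[i], &acc)) {
            return false;
        }
    }
    *out = acc;
    return true;
}
```

Compiling both functions with optimizations on and comparing the 
disassembly should show the extra jump in the checked version.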