Time to move std.experimental.checkedint to std.checkedint ?
tsbockman
thomas.bockman at gmail.com
Tue Mar 30 23:01:44 UTC 2021
On Tuesday, 30 March 2021 at 17:53:37 UTC, Walter Bright wrote:
> On 3/30/2021 10:09 AM, tsbockman wrote:
>> So you're now dismissing Zig as slow because its feature set
>> surprised you?
>
> Because it surprised me? No. Because if someone had figured out
> a way to do overflow checks for no runtime costs, it would be
> in every language. I know Rust tried pretty hard to do it.
Zero runtime cost is not a reasonable standard unless the feature
is completely worthless and cannot be turned off.
>> No real-world data is necessary? No need to understand any of
>> Zig's relevant optimizations or options?
>
> I don't have to test a brick to assume it won't fly. But I
> could be wrong, definitely. If you can prove me wrong in my
> presumption, I'm listening.
Since I have already been criticized for the use of
micro-benchmarks, I assume that only data from complete, practical
applications will suffice.
Unfortunately, idiomatic C, C++, D, and Rust source code all
omit the information required to perform such tests. Simply
flipping compiler switches (the GCC -ftrapv and -fwrapv flags
Andrei mentioned earlier) won't work, because most high
performance code contains some deliberate and correct uses of
wrapping overflow, signed-unsigned reinterpretation, etc.
Idiomatic Zig code (probably Ada, too) does contain this
information. But, the selection of "real world" open source Zig
code available for testing is limited right now, since Zig hasn't
stabilized the language or the standard library yet.
The best test subject I have found, compiled, and run
successfully is this:
https://github.com/Vexu/arocc
It's an incomplete C compiler: "Right now preprocessing and
parsing is mostly done but anything beyond that is missing." I
believe compilation is a fairly integer-intensive workload, so
the results should be meaningful.
To test, I took the C source code of gzip and duplicated its
contents many times until I got the arocc wall time up to about 1
second. (The final input file is 37.5 MiB.) arocc outputs a long
stream of error messages to stderr, whose contents aren't
important for our purposes.
In order to minimize the time consumed by I/O, I run each test
several times in a row and ignore the early runs, to ensure that
the input file is cached in RAM by the OS, and pipe the output of
arocc (both stdout and stderr) to /dev/null.
Results with -O ReleaseSafe (optimizations on, with checked
integer arithmetic, bounds checks, null checks, etc.):

  Binary size:     2.0 MiB
  Wall clock time: 1.31 s
  System time:     0.71 s
  User time:       0.60 s
  CPU usage:       99% of a single core

Results with -O ReleaseFast (optimizations on, with safety checks
off):

  Binary size:     2.3 MiB
  Wall clock time: 1.15 s
  System time:     0.68 s
  User time:       0.46 s
  CPU usage:       99% of a single core
So, in this particular task, ReleaseSafe (which checks for a lot
of other things, not just integer overflow) takes 14% longer than
ReleaseFast in wall clock time. If you only care about user time,
that is about 30% longer (0.60 s vs. 0.46 s).
Last time I checked, these numbers are similar to the performance
difference between optimized builds by DMD and LDC/GDC. They are
also similar to the performance differences within related
language pairs like C/C++, Java/C#, Ada/C in language comparison
benchmarks like:
https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/cpp.html
Note also that with Zig's approach, paying the modest performance
penalty for the various safety checks is *completely optional* in
release builds (just like D's bounds checking). Even for
applications where that last increment of speed is considered
essential in production, Zig's approach still leads to clearer,
easier-to-debug code, since the checks remain active in debug and
ReleaseSafe builds.
So, unless DMD (or C itself!) is "a brick" that "won't fly", your
claim that this is something that a high performance systems
programming language just cannot do is not grounded in reality.