Make using compiled libs with debug code better
Steven Schveighoffer
schveiguy at gmail.com
Mon Oct 18 13:04:47 UTC 2021
I wrote about this in a reply to a learn thread, but realized I should
really post it here on its own.
A user posted seemingly innocuous code that caused a segmentation fault
[here](https://forum.dlang.org/post/nkfejgkrzntybqetkvkj@forum.dlang.org),
and also asked about it on Discord. We worked through trying to figure
out the issue, and it turns out it was a memory corruption (the seg
fault was deep inside the garbage collector).
But not one that I would have expected. It was an out-of-bounds access
passed into a
[std.bitmanip.BitArray](https://dlang.org/phobos/std_bitmanip.html#BitArray),
which corrupted GC metadata, and caused a later allocation to fail. Like
any reasonable array type, `BitArray` uses asserts to enforce bounds.
And he did NOT turn off bounds checks.
So what happened? Phobos is compiled in release mode, even if *your code
is not*. So whether a type's bounds checks run depends on how the library
that defines it was compiled, not on the flags you pass for your own code.
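For illustration, here is a minimal sketch of that kind of bug. It is not the code from the linked thread; the length and index are made up. With a release-built Phobos the last line is undefined behavior, and it may silently write past the allocation rather than fail:

```d
import std.bitmanip : BitArray;

void main()
{
    BitArray bits;
    bits.length = 8;      // backing storage comes from the GC

    bits[3] = true;       // in bounds, fine
    assert(bits[3]);

    // Out of bounds (made-up index). BitArray guards this with an assert,
    // but that check lives in the precompiled library code, so it only
    // fires if Phobos itself was built with asserts enabled.
    bits[1000] = true;
}
```

Nothing in how you compile this file changes the outcome; only the build of Phobos you link against does.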
If `BitArray` were a template, it might have bounds checks enabled. But
maybe not, if the compiler detected it was already instantiated inside
Phobos and decided it didn't need to emit the code for it. This kind of
Russian Roulette seems prone to cause hours of debugging when an enabled
bounds check could have given you an answer instantly.
How do we fix such a thing? The easiest thing I can think of is to ship
an assert-enabled version of Phobos, and use that when requested. This
at least allows a quick mechanism to test your code with bounds checks
(and other asserts) enabled. LDC actually already has this, see
https://d.godbolt.org/z/feETE8W78 (courtesy of Max Haughton)
But this feels awkward. If you build your code with bounds checks, you
would like the libraries you use to have them also. And a compiler
switch isn't going to help you when you are using other libs.
Now consider that `BitArray`, being quite old, uses contracts to enforce
its bounds checks. However, D runs the contracts inside the function
itself, instead of at the call site. But this seems wrong to me, as the
contracts are checking the *caller's* code, not the *callee's*.
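To make that concrete, here is a small sketch using a hypothetical `Bits` struct (my own stand-in, not Phobos code). The `in` contract validates the caller's argument, but it is compiled as part of the function, so it exists only if the module containing `Bits` is built without `-release`:

```d
import core.exception : AssertError;
import std.exception : assertThrown;

// Hypothetical stand-in for a library type; not the Phobos implementation.
struct Bits
{
    private bool[] data;

    this(size_t n) { data = new bool[n]; }

    bool opIndexAssign(bool b, size_t i)
    in (i < data.length, "index out of bounds") // validates the caller's argument
    {
        // Like BitArray, the body does no checking of its own: `.ptr`
        // bypasses the slice bounds check, so the contract is the only guard.
        return data.ptr[i] = b;
    }
}

void main()
{
    auto b = Bits(8);
    b[3] = true;                               // contract passes
    assertThrown!AssertError(b[1000] = true);  // contract rejects the index
}
```

Here everything is in one module, so the check follows my own compiler flags. Put `Bits` in a library built with `-release` and the contract is gone, no matter how the calling code is compiled.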
Would it be a reasonable thing to change contracts to be called by the
caller instead of the callee? Is that something that could make its way
into D, such that the input checking of functions that are compiled for
release still can run when you compile your code in non-release mode?
-Steve