Make using compiled libs with debug code better

Steven Schveighoffer schveiguy at gmail.com
Mon Oct 18 13:04:47 UTC 2021


I wrote about this in a reply to a learn thread, but realized I should 
really post it here on its own.

A user posted seemingly innocuous code that caused a segmentation fault 
[here](https://forum.dlang.org/post/nkfejgkrzntybqetkvkj@forum.dlang.org), 
and also asked about it on Discord. We worked through trying to figure 
out the issue, and it turned out to be memory corruption (the seg 
fault was deep inside the garbage collector).

But it was not corruption of a kind I would have expected. An 
out-of-bounds index was passed into a 
[std.bitmanip.BitArray](https://dlang.org/phobos/std_bitmanip.html#BitArray), 
which corrupted GC metadata and caused a later allocation to fail. Like 
any reasonable array type, `BitArray` uses asserts to enforce bounds. 
And he did NOT turn off bounds checks.

So what happened? Phobos is compiled in release mode, even if *your code 
is not*. So the asserts inside `BitArray` were already compiled out when 
Phobos was built, no matter how the user's code was compiled.
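
To make the mechanism concrete, here is a minimal sketch. `TinyBits` is a hypothetical stand-in written for this illustration, not the real `BitArray`; it only assumes, as `BitArray` does, that storage is reached through a raw pointer and guarded by an `assert`.

```d
import core.exception : AssertError;
import std.exception : assertThrown;

// Hypothetical stand-in for a precompiled library type; NOT the real BitArray.
struct TinyBits
{
    private ubyte* ptr; // raw pointer, so no slice bounds checks apply
    private size_t len; // number of valid bits

    this(size_t nbits)
    {
        ptr = (new ubyte[(nbits + 7) / 8]).ptr;
        len = nbits;
    }

    void set(size_t i, bool value)
    {
        // The only bounds check. It exists only if the module containing
        // TinyBits is compiled without -release.
        assert(i < len, "index out of bounds");
        immutable ubyte mask = cast(ubyte)(1 << (i % 8));
        if (value)
            ptr[i / 8] |= mask;
        else
            ptr[i / 8] &= cast(ubyte)(mask ^ 0xFF);
    }

    bool get(size_t i) const
    {
        assert(i < len, "index out of bounds");
        return (ptr[i / 8] & (1 << (i % 8))) != 0;
    }
}

unittest
{
    auto bits = TinyBits(8);
    bits.set(3, true);
    assert(bits.get(3));
    assert(!bits.get(4));
    // Caught here because this build keeps asserts. If TinyBits had been
    // compiled with -release, the same call would write past the allocation.
    assertThrown!AssertError(bits.set(100, true));
}
```

One way to try it is `dmd -unittest -main -run tinybits.d` (the file name is arbitrary). The point is that the flags used for the module defining `TinyBits` decide whether the check exists, not the flags used for the code calling it.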

If `BitArray` were a template, it might have bounds checks enabled. But 
maybe not, if the compiler detected it was already instantiated inside 
Phobos and decided it didn't need to emit the code for it. This kind of 
Russian roulette seems prone to cause hours of debugging when a failed 
assert could have given you the answer instantly.

How do we fix such a thing? The easiest thing I can think of is to ship 
an assert-enabled version of Phobos, and use that when requested. This 
at least allows a quick mechanism to test your code with bounds checks 
(and other asserts) enabled. LDC actually already has this; see 
https://d.godbolt.org/z/feETE8W78 (courtesy of Max Haughton).

But this feels awkward. If you build your code with bounds checks, you 
would like the libraries you use to have them also. And a compiler 
switch isn't going to help you when you are using other libs.

Now consider that `BitArray`, being quite old, uses contracts to enforce 
its bounds checks. However, D compiles the contracts into the function 
itself, instead of at the call site, so whether they run is decided when 
the library is built. This seems wrong to me, as the contracts are 
checking the *caller's* code, not the *callee's*.
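
For readers who have not used them, this is roughly what such an `in` contract looks like. `testBit` is a hypothetical free function written for this sketch, not Phobos code; imagine it living in a precompiled library.

```d
import core.exception : AssertError;
import std.exception : assertThrown;

// Hypothetical library function; assume it ships precompiled.
bool testBit(const(ubyte)* bits, size_t len, size_t i)
in
{
    // Validates the CALLER's argument, yet it is compiled as part of
    // the callee, under the callee's build flags.
    assert(i < len, "i must be less than the length");
}
do
{
    return (bits[i / 8] & (1 << (i % 8))) != 0;
}

unittest
{
    ubyte[2] storage = [0b0000_0101, 0];
    assert(testBit(storage.ptr, 16, 0));
    assert(!testBit(storage.ptr, 16, 1));
    assert(testBit(storage.ptr, 16, 2));
    // Rejected only because this file itself was built with contracts on.
    assertThrown!AssertError(testBit(storage.ptr, 16, 16));
}
```

If `testBit` were instead compiled with `-release` into a library, the `in` block would be gone, and the last call would read out of bounds no matter how the calling code was built.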

Would it be reasonable to change contracts so that they are run by the 
caller instead of the callee? Is that something that could make its way 
into D, such that the input checking of functions that are compiled for 
release can still run when you compile your code in non-release mode?
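
As a rough illustration of the idea, the effect can be imitated by hand today with a wrapper that lives in the user's own code. Both names below are hypothetical, and this is only a sketch of the behaviour being asked for, not a proposal for how the compiler would implement it.

```d
import core.exception : AssertError;
import std.exception : assertThrown;

// Stand-in for a library function built with -release: its contract has
// been compiled out, so nothing is checked here. (Hypothetical name.)
bool rawTestBit(const(ubyte)* bits, size_t len, size_t i)
{
    return (bits[i / 8] & (1 << (i % 8))) != 0;
}

// Hand-written caller-side check (hypothetical helper). It is compiled
// with the user's flags, so it survives unless the user asks for -release.
bool checkedTestBit(const(ubyte)* bits, size_t len, size_t i)
{
    assert(i < len, "i must be less than the length");
    return rawTestBit(bits, len, i);
}

unittest
{
    ubyte[1] storage = [0b0000_0010];
    assert(checkedTestBit(storage.ptr, 8, 1));
    assert(!checkedTestBit(storage.ptr, 8, 0));
    // The bad index is stopped before the unchecked library code runs.
    assertThrown!AssertError(checkedTestBit(storage.ptr, 8, 8));
}
```

Caller-run contracts would amount to the compiler generating the equivalent of `checkedTestBit` at each call site, using the precondition declared on the library function.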

-Steve

