Memory corruption with -O3, but not -O2 (and not with DMD)

David Nadlinger code at klickverbot.at
Tue Aug 20 08:18:27 UTC 2019


Dear James,

As mentioned by kinke elsewhere, this is pretty much impossible to track 
down from our end without more information.

Is the corruption deterministic across multiple runs of one particular 
executable? Setting a memory watchpoint on the address that gets 
corrupted might provide some extra clues as to where it comes from.

Having a look at the LLVM IR (-output-ll) might also be illuminating; I 
personally find it easier to read than assembly. In particular, you 
could use the LLVM `opt` tool to apply the -O3 passes one by one, 
compiling to object code, linking and testing every step of the way, and 
compare the IR before/after the pass that first introduces the crash. 
(The `bugpoint` tool has some support for this, but you might be quicker 
doing this manually.)
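
If the pass list is long, the search for the first bad pass can be 
shortened by bisecting over the number of passes applied. Below is a 
minimal Python sketch, assuming that once the failure appears at some 
pass count it persists for every larger count. `first_bad_step` and the 
`is_bad` callback are hypothetical names; how you build and test with 
only the first N passes (for instance via `opt`) is left to that callback.

```python
from typing import Callable


def first_bad_step(num_steps: int, is_bad: Callable[[int], bool]) -> int:
    """Return the smallest n in 1..num_steps for which is_bad(n) is true.

    Assumptions (not checked here): applying 0 steps gives a working
    build, applying all num_steps gives a failing one, and the failure
    never disappears again once it has been introduced.
    """
    lo, hi = 0, num_steps  # invariant: lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid  # failure already present after `mid` steps
        else:
            lo = mid  # still fine after `mid` steps
    return hi


if __name__ == "__main__":
    # Stand-in predicate: pretend the 37th step introduces the corruption.
    print(first_bad_step(100, lambda n: n >= 37))
```

Since the symptom is reported to be intermittent, `is_bad` may need to 
run the test several times before declaring a given step good.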

Best regards,
David


On 17 Aug 2019, at 21:33, James Blachly via digitalmars-d-ldc wrote:

> Hi all,
>
> First, as always, thanks for LDC2, without which we couldn't write 
> high-performance D software for our lab.
>
> I've run into a problem wherein after ~60,000 iterations of a loop 
> we get memory corruption, but only when building with LDC2 and -O3; 
> there are no problems AFAICT with -O2, or when building optimized 
> versions with DMD. -enable-inlining does not make a difference. All of 
> that being said, it does not rule out me making a pointer or memory 
> error, but all seems well except with LDC2 -O3.
>
> Debugging has been difficult because -O3 optimizes away a lot and thus 
> lldb is not able to show me the tracking debugging variables I need to 
> isolate the problematic code. Disassembling, I've found the register 
> storing the pointer to the corrupt string; interestingly, the correct 
> string appears just slightly lower on the heap (maybe 32 bytes IIRC).
>
> The manifestation is slightly nondeterministic -- adding tracking 
> variables and code makes the problem intermittent.
>
> Indeed, I placed some simple guards (e.g.: if (pre_string != 
> post_string) throw new Exception() ) near the place where the corrupt 
> memory is manifest and sometimes the guard is triggered, while other 
> times it is _not_, but the bad string shows up just a few statements 
> later (inside a function).
>
> Am at my wits' end, so help or next steps are greatly appreciated. I 
> can provide disassembly of whatever combination of -O/-O2/-O3 and 
> triggering/nontriggering code blocks if it would be helpful.
>
> If this should move to github, let me know.
>
> Kind regards

