How to track down a bad llvm optimization pass
David Nadlinger via digitalmars-d-ldc
digitalmars-d-ldc at puremagic.com
Thu Jun 30 11:10:04 PDT 2016
On 30 Jun 2016, at 16:40, Joakim via digitalmars-d-ldc wrote:
> I assumed that undef was some kind of poison value
undef is indeed "some kind of poison value", in that each use of it
evaluates to a (potentially different) arbitrary bit string. By itself,
using an undef isn't undefined behaviour, but of course for many
operations it ultimately is, because there are bit string inputs for
which these operations are undefined (e.g. loads, stores).
LLVM knows a concept called "poison values" too, which are undefs with
slightly stronger semantics produced by C-style signed integer
arithmetic overflow and similar operations – in loose terms, any
operation that depends on them in an externally visible way has
undefined behaviour.
I usually find the LLVM language reference
(http://llvm.org/docs/LangRef.html) to be quite a clear resource for
these sorts of questions.
> [should] the inlining pass […] just be returning undef […]? Since
> this is at compile-time, I don't think it should. […]
> Are we supposed to be running sanitizers or something else to avoid
> these bugs?
First off, as it currently stands, this is certainly not an issue in
LLVM. The lshr instruction is documented as producing an undefined
result when used with an out-of-range shift amount. Replacing the whole
call with `undef` is thus a valid IR transformation.
So much for LLVM working as designed. The question of course becomes
whether, being a compiler writer's tool, it would be nice for it to emit
a warning on such transformations. And here things suddenly become
muddy. Yes, in this case, getting a warning would be useful. However, if
the code was not actually reachable dynamically, a warning would be
wrong. Of course, this can be solved by offering a way to declare basic
blocks/functions to be considered reachable for that purpose, but that
introduces extra complexity – I wouldn't be surprised if the fact that
you'd need to design something along these lines was the main reason why
LLVM does not try to report such conditions.
Of course, language frontends can always emit dynamic checks to avoid
executing llvm::Instructions with UB-inducing arguments, whether in the
form of sanitisers, or by default as part of faithfully lowering their
semantics.
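
As a rough illustration of that last point, here is a minimal C++ sketch
of the two flavours of check a frontend might emit around a logical
right shift. The helper names checked_lshr and defined_lshr are
hypothetical, as is the choice of 0 as the result for an oversized
shift; nothing here is taken from LDC or LLVM itself.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical "sanitiser-style" lowering: refuse to perform the shift
// when the amount is out of range, so the caller can report an error
// instead of ever executing the undefined operation.
std::optional<std::uint32_t> checked_lshr(std::uint32_t value,
                                          std::uint32_t amount) {
    if (amount >= 32) {
        return std::nullopt;  // out-of-range shift detected at run time
    }
    return value >> amount;
}

// Hypothetical "defined semantics" lowering: the language simply declares
// that shifting out all bits yields 0 (an assumption for this sketch),
// and the frontend emits the comparison unconditionally.
std::uint32_t defined_lshr(std::uint32_t value, std::uint32_t amount) {
    return amount >= 32 ? 0u : value >> amount;
}
```

Either way, the shift itself is only reached with an in-range amount,
so the optimiser has no undefined result to propagate.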
— David