How to prevent optimizer from reordering stuff?

Dan Olson via digitalmars-d-ldc digitalmars-d-ldc at puremagic.com
Sat Mar 14 13:20:50 PDT 2015


"David Nadlinger" <code at klickverbot.at> writes:

> On Saturday, 14 March 2015 at 18:42:45 UTC, Dan Olson wrote:
>> While tracking down std.math problems for ARM, I find that optimizer
>> will reorder instructions to get FPSCR flags before the divide
>> operation.
>
> IIRC FP flag/mode support is a tricky topic in LLVM in general, but
> this specific problem seems weird. What are the attributes for
> __D3std4math9ieeeFlagsFNdZS3std4math9IeeeFlags in the IR? The
> optimizer should never move code across arbitrary function calls…
>
> David

Hi David.

I don't see any attributes for that function.  I will just paste
some of the -output-ll results since nothing sticks out to me.

declare fastcc void @_D3std4math9ieeeFlagsFNdZS3std4math9IeeeFlags(%std.math.IeeeFlags* noalias sret)

define fastcc void @_D10unittester3fooFZv() {
  %flags = alloca %std.math.IeeeFlags, align 4
  %1 = load double* @_D10unittester4zeroe, align 8
  %2 = fdiv double 1.000000e+00, %1
  %3 = tail call i32 asm sideeffect "vmrs $0, fpscr", "=r"() #0
  call fastcc void @_D3std4math9ieeeFlagsFNdZS3std4math9IeeeFlags(%std.math.IeeeFlags* noalias sret %flags)
  %tmp = call fastcc i1 @_D3std4math9IeeeFlags9divByZeroMFNdZb(%std.math.IeeeFlags* %flags)
  %4 = zext i1 %tmp to i32
  %tmp1 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([11 x i8]* @.str12, i32 0, i32 0), double %2, i32 %3, i32 %4)
  ret void
}
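
For readers following along, here is a minimal D sketch of the source
pattern that the IR above exercises: clear the sticky flags, divide, then
read them back through std.math. It is not the original unittester code.
DivideResult and divideAndReadFlags are hypothetical names, and the
inline-asm FPSCR read from foo is left out so the sketch is not
ARM-specific.

```d
import std.math : IeeeFlags, ieeeFlags, resetIeeeFlags;

// Hypothetical result type, not from the original post.
struct DivideResult
{
    double quotient;  // value produced by the divide
    bool divByZero;   // divide-by-zero flag as read after the divide in source order
}

// Hypothetical helper, not from the original post.
DivideResult divideAndReadFlags(double numerator, double denominator)
{
    resetIeeeFlags();                           // clear the sticky exception flags
    double quotient = numerator / denominator;  // corresponds to the fdiv in the IR
    IeeeFlags flags = ieeeFlags;                // read the FP status flags via std.math
    // Returning the quotient keeps the divide from being dropped as dead
    // code. It does not by itself pin the divide ahead of the flag read;
    // that ordering is the open question in this thread.
    return DivideResult(quotient, flags.divByZero);
}
```

Called as divideAndReadFlags(1.0, zero), with zero a mutable module-level
variable as in the IR, this is the case where the flag read can end up
ahead of the divide once the optimizer is involved, so the divByZero
result should not be trusted without checking the generated code.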

The only guess I have right now comes from the AAPCS:

http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapcs.pdf

  The FPSCR is the only status register that may be accessed by
  conforming code. It is a global register with the following
  properties:

  - The condition code bits (28-31), the cumulative saturation (QC) bit
    (27) and the cumulative exception-status bits (0-4) are not
    preserved across a public interface.

  (snip)

Maybe that means the compiler can say the FPSCR state from my vdiv.f64
is undefined across function call boundaries, so ordering should not
matter?

