[Issue 10820] New: curly brackets prevent inlining with DMD

d-bugmail at puremagic.com
Wed Aug 14 09:34:00 PDT 2013


http://d.puremagic.com/issues/show_bug.cgi?id=10820

           Summary: curly brackets prevent inlining with DMD
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: critical
          Priority: P2
         Component: DMD
        AssignedTo: nobody at puremagic.com
        ReportedBy: monarchdodra at gmail.com


--- Comment #0 from monarchdodra at gmail.com 2013-08-14 09:33:58 PDT ---
DMD 2.064 BETA
DMD affected. gdc unaffected. ldc2 untested.

I was benchmarking a very tight loop, and I discovered that when a branch of
an if statement is enclosed in curly braces, the function is not inlined:

Here are 4 equivalent functions:
uint foo1(char c) @safe pure nothrow
{
    if (c < 0x80)
        return 1;
    else
        return 2;
}
uint foo2(char c) @safe pure nothrow
{
    if (c < 0x80)
    {
        return 1;
    }
    else
    {
        return 2;
    }
}

uint foo3(char c) @safe pure nothrow
{
    if (c < 0x80)
        return 1;
    else
        return foo3impl(c);
}
uint foo3impl(char c) @safe pure nothrow
{
    return 2;
}
uint foo4(char c) @safe pure nothrow
{
    if (c < 0x80)
    {
        return 1;
    }
    else
        return foo4impl(c);
}
uint foo4impl(char c) @safe pure nothrow
{
    return 2;
}

And a program that benchmarks them:

//----
import std.stdio, std.datetime;
enum N = 5000;

void main()
{
    char c = 'a';
    StopWatch st1;
    StopWatch st2;
    StopWatch st3;
    StopWatch st4;
    immutable len = 1000;

    foreach(_ ; 0 .. N)
    {
        for (size_t i ; i < len ; )
        {
            i += foo1(c);
            i += foo2(c);
            i += foo3(c);
            i += foo4(c);
        }
    }

    foreach(K ; 0 .. 10)
    {
        st1.start;
        foreach(_ ; 0 .. N)
        {
            size_t i = 0;
            while(i != len)
                i += foo1(c);
        }
        st1.stop;
        st2.start;
        foreach(_ ; 0 .. N)
        {
            size_t i = 0;
            while(i != len)
                i += foo2(c);
        }
        st2.stop;
        st3.start;
        foreach(_ ; 0 .. N)
        {
            size_t i = 0;
            while(i != len)
                i += foo3(c);
        }
        st3.stop;
        st4.start;
        foreach(_ ; 0 .. N)
        {
            size_t i = 0;
            while(i != len)
                i += foo4(c);
        }
        st4.stop;
    }
    writefln("foo1: %sms.", st1.peek.msecs);
    writefln("foo2: %sms.", st2.peek.msecs);
    writefln("foo3: %sms.", st3.peek.msecs);
    writefln("foo4: %sms.", st4.peek.msecs);
}
//----

When compiled with DMD without -inline:
foo1: 2338ms.
foo2: 2337ms.
foo3: 2333ms.
foo4: 2337ms.

When compiled with DMD with -O -inline:
foo1: 282ms.
foo2: 2244ms.
foo3: 282ms.
foo4: 2246ms.

This is very strange, as foo1 and foo2 are *strictly* equivalent, save for the
curly braces, and so are foo3 and foo4. As a matter of fact, in my original
use case, I got better performance by cascading function calls to remove the
braces than by having blocks in my ifs.
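
A minimal sketch of that workaround, assuming a branch that needs several
statements. The names stride and strideSlow are hypothetical, not taken from
the original code, and whether DMD actually inlines the result is untested
here; it would need to be confirmed with a benchmark like the one above:

```d
// Hypothetical sketch: keep the if statement free of braces by moving the
// multi-statement branch into a helper function (names are illustrative).
uint strideSlow(char c) @safe pure nothrow
{
    // Stands in for a branch that needs more than one statement.
    uint n = 2;
    if (c >= 0xE0)
        n = 3;
    return n;
}

uint stride(char c) @safe pure nothrow
{
    // Both branches are single statements, so no braces are needed.
    if (c < 0x80)
        return 1;
    else
        return strideSlow(c);
}
```

This has the same shape as foo3 above: the caller stays brace-free, at the
cost of one extra function per multi-statement branch.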

I don't have any proof, but I'd be willing to bet this is a cause of *major*
performance issues with DMD.
