[Issue 14458] New: very slow ubyte[] assignment (dmd doesn't use memset)

via Digitalmars-d-bugs digitalmars-d-bugs at puremagic.com
Fri Apr 17 13:08:29 PDT 2015


https://issues.dlang.org/show_bug.cgi?id=14458

          Issue ID: 14458
           Summary: very slow ubyte[] assignment (dmd doesn't use memset)
           Product: D
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P1
         Component: DMD
          Assignee: nobody at puremagic.com
          Reporter: code at dawg.eu

I tracked down a severe performance issue in my new AA (associative array)
implementation to the code that zeroes a freshly allocated entry.

For the slice assignment below, DMD generates the following code.
----
void zero(ubyte[] ary) { ary[] = 0; }
----
        mov     rcx, rdi                                ; 0008 _ 48: 89. F9
        xor     rax, rax                                ; 000B _ 48: 31. C0
        mov     rdi, rsi                                ; 000E _ 48: 8B. FE
        rep stosb                                       ; 0011 _ F3: AA
----

This is a bytewise store of 0 and is about 4x slower than memset when the size
in bytes (sz, i.e. ary.length) is >= 4. It's slightly faster for sz < 4.
I'm not sure why `rep stosb` suddenly becomes 4x slower when sz increases from
3 to 4 bytes, but in any case the compiler should either optimize the small
case to direct assignments and the big case to memset, or always use memset.
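
Until the compiler does this itself, one possible workaround is to call the C
memset directly. The following is only a minimal sketch: zeroWithMemset is a
hypothetical helper name, not part of druntime or Phobos, and whether it
actually beats `ary[] = 0` on a given machine and size would need to be
measured.

```d
import core.stdc.string : memset;

// Hypothetical helper (name is an assumption, not an existing API):
// zeroes a slice via the C library's memset instead of `ary[] = 0`.
void zeroWithMemset(ubyte[] ary) @trusted nothrow @nogc
{
    // Skip empty slices so memset is never handed a null pointer.
    if (ary.length)
        memset(ary.ptr, 0, ary.length);
}

void main()
{
    auto buf = new ubyte[](64);
    buf[] = 0xFF;          // fill with a non-zero pattern first
    zeroWithMemset(buf);   // then clear it through memset
    foreach (b; buf)
        assert(b == 0);
}
```

The @trusted annotation is a judgment call: the pointer and length both come
from the same slice, so the memset call cannot write out of bounds.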
