memcpy vs slice copy
Don
nospam at nospam.com
Mon Mar 16 02:34:33 PDT 2009
Sergey Gromov wrote:
> Sun, 15 Mar 2009 13:17:50 +0000 (UTC), Moritz Warning wrote:
>
>> On Sat, 14 Mar 2009 23:50:58 -0400, bearophile wrote:
>>
>>> While doing some string processing I've seen some unusual timings
>>> compared to the C code, so I have written this to see the situation
>>> better. When USE_MEMCPY is false this little benchmark runs about 3+
>>> times slower:
>> I did a little benchmark:
>>
>> ldc -release -O5
>> true: 0.51
>> false: 0.63
>>
>> dmd -release -O
>> true: 4.47
>> false: 3.58
>>
>> I don't see a very big difference between slice copying and memcpy (but
>> between compilers).
>>
>> Btw.: http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D.bugs&artnum=14933
>
> The original benchmark swapped insanely on my 1GB laptop so I've cut the
> number of iterations in half, to 50_000_000. Compiled with -O -release
> -inline. Results:
>
> slice: 2.31
> memcpy: 0.73
>
> That's 3 times difference. Disassembly:
>
> slice:
> L31:    mov     ECX,EDX
>         mov     EAX,6
>         lea     ESI,010h[ESP]
>         mov     ECX,EAX
>         mov     EDI,EDX
>         rep movsb
>         add     EDX,6
>         add     EBX,6
>         cmp     EBX,011E1A300h
>         jb      L31
>
> memcpy:
> L35:    push    6
>         lea     ECX,014h[ESP]
>         push    ECX
>         push    EBX
>         call    near ptr _memcpy
>         add     EBX,6
>         add     ESI,6
>         add     ESP,0Ch
>         cmp     ESI,011E1A300h
>         jb      L35
>
> Seems like rep movsb is /way/ sub-optimal for copying data.
Definitely! The difference ought to be bigger than a factor of 3, which
means that memcpy probably isn't anywhere near optimal either.

rep movsd is always 4 times quicker than rep movsb. There's a range of
lengths for which rep movsd is optimal; outside that range, there are
other options which are even faster.

So there's a factor of 4-8 speedup available on most memory copies.
Low-hanging fruit! <g>
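
To make the movsb/movsd comparison concrete, here is a minimal sketch in C (not from the original benchmark, which isn't shown in this thread). The names copy_bytes and copy_words are hypothetical, chosen for illustration only. The first loop moves one byte per step, as rep movsb does; the second moves four bytes per step and then finishes the 0-3 leftover bytes, which is roughly the shape of a rep movsd based copy. An optimizing compiler may well rewrite either loop, so treat this as a picture of the idea rather than as a benchmark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time copy: one byte moved per iteration (the rep movsb shape). */
static void copy_bytes(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Word-at-a-time copy: n / 4 four-byte moves (the rep movsd shape),
   followed by a byte copy of the n % 4 trailing bytes. */
static void copy_words(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t words = n / 4;
    for (size_t i = 0; i < words; i++) {
        uint32_t w;
        /* Fixed-size memcpy is used only as an alignment-safe
           4-byte load and store; it is not the copy under discussion. */
        memcpy(&w, src + 4 * i, sizeof w);
        memcpy(dst + 4 * i, &w, sizeof w);
    }
    copy_bytes(dst + 4 * words, src + 4 * words, n % 4);
}
```

For the 6-byte copy in the disassembly above, copy_bytes takes six steps, while copy_words takes one 4-byte step plus two trailing bytes.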