Replacing C's memcpy with a D implementation
David Nadlinger
code at klickverbot.at
Sun Jun 17 16:01:10 UTC 2018
On Monday, 11 June 2018 at 08:02:42 UTC, Walter Bright wrote:
> On 6/10/2018 9:44 PM, Patrick Schluter wrote:
>> See what Agner Fog has to say about it:
>
> Thanks. Agner Fog gets the last word on this topic!
Well, Agner is rarely wrong indeed, but there is a limit to how
much material a single person can keep up to date.
On newer uarchs, `rep movsb` isn't slower than `rep movsd`, and
often performs similarly to the best SSE2 implementation (using NT
stores). See "BeeOnRope"'s answer to this StackOverflow question
for an in-depth discussion about this:
https://stackoverflow.com/questions/43343231/enhanced-rep-movsb-for-memcpy
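For reference, a `rep movsb` copy is only a few lines. The following is a minimal sketch in C, assuming GCC or Clang inline assembly on x86-64; the name `copy_rep_movsb` is made up for illustration, and the function falls back to plain `memcpy` on other targets. It is not a tuned implementation and makes no claim about how it performs on any particular CPU.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from any library): copies n bytes with
 * `rep movsb` on x86-64 when built with GCC or Clang, and falls back
 * to memcpy everywhere else. */
static void *copy_rep_movsb(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    void *d = dst;
    /* RDI = destination, RSI = source, RCX = byte count. All three are
     * modified by the instruction, hence the "+" constraints. The ABI
     * guarantees the direction flag is clear on function entry, so the
     * copy runs forward. */
    __asm__ volatile("rep movsb"
                     : "+D"(d), "+S"(src), "+c"(n)
                     :
                     : "memory");
    return dst;
#else
    return memcpy(dst, src, n);
#endif
}
```

Like `memcpy`, this sketch assumes the two buffers do not overlap.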
AVX2 seems to offer extra performance beyond that, though, if it
is available (for example if runtime feature detection is used).
I believe I read a comment by Agner somewhere to that effect as
well – a search engine will certainly be able to turn up more.
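To make "runtime feature detection" concrete, here is a rough sketch in C, assuming GCC or Clang on x86-64. `__builtin_cpu_init` and `__builtin_cpu_supports` are compiler builtins; `copy_avx2`, `copy_fn` and `select_copy` are hypothetical names invented for this example. The 32-byte loop is deliberately naive (no alignment handling, no NT stores), so treat it as an illustration of the dispatch pattern rather than a fast memcpy.

```c
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>

/* Hypothetical 256-bit copy loop. The target attribute lets this one
 * function use AVX2 code generation even if the rest of the file is
 * built for baseline x86-64. (The unaligned 256-bit load/store used
 * here strictly need only AVX; gating on AVX2 follows the post.) */
__attribute__((target("avx2")))
static void *copy_avx2(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n >= 32) {
        __m256i v = _mm256_loadu_si256((const __m256i *)s);
        _mm256_storeu_si256((__m256i *)d, v);
        d += 32;
        s += 32;
        n -= 32;
    }
    memcpy(d, s, n); /* remaining tail of fewer than 32 bytes */
    return dst;
}
#endif

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Picks an implementation once, based on what the running CPU reports. */
static copy_fn select_copy(void)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return copy_avx2;
#endif
    return memcpy; /* portable fallback */
}
```

A caller would typically run `select_copy()` once at startup, store the returned pointer, and call through it afterwards, so the CPU check is not repeated on every copy.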
— David