core.traits?

H. S. Teoh hsteoh at quickfur.ath.cx
Wed Jan 9 17:40:38 UTC 2019


On Wed, Jan 09, 2019 at 12:31:13PM +0000, Patrick Schluter via Digitalmars-d wrote:
[...]
> This said, another issue with memcpy that very often gets lost is
> that, because of the fancy benchmarking, its system performance cost
> is often wrongly assessed, and a lot of heroic efforts are put in
> optimizing big block transfers, while in reality it's mostly called on
> small (postblit) to medium blocks.

EXACTLY!!!

Some time ago I took an interest in implementing the equivalent of
strchr in the most optimized way possible. For that, I wrote several of
my own algorithms and also perused the glibc implementation.

Eventually, I realized that the glibc implementation, which uses fancy
64-bit-word scanning with a lot of setup overhead and messy
starting/trailing cases, is optimizing for very large scans, i.e., when
the byte being sought occurs only rarely in a very large haystack.  In
those cases it's at the top of benchmarks.  However, in the arguably
more common case where the byte being sought occurs relatively
frequently in small- to medium-sized haystacks, repeatedly searching the
haystack incurs a ton of overhead setting up all that fancy machinery,
branch hazards, and what-not, where a plain ole
`while (*ptr++ != needle) {}` works much better.
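
To make the contrast concrete, here is a rough sketch of the two styles.
This is NOT glibc's actual code, just a simplified illustration of the
general word-at-a-time idea; the names find_byte_simple and
find_byte_wordwise are made up for this example, and both take an
explicit length (unlike the one-liner above) so they can be compared
safely.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Plain byte loop: no setup cost, one compare per byte. */
static const char *find_byte_simple(const char *s, size_t n, char needle)
{
    for (size_t i = 0; i < n; i++)
        if (s[i] == needle)
            return s + i;
    return NULL;
}

/* Word-at-a-time scan (illustrative sketch, not glibc's implementation).
 * Note the three phases: leading bytes, 8-byte main loop, trailing bytes. */
static const char *find_byte_wordwise(const char *s, size_t n, char needle)
{
    const uint64_t ones    = 0x0101010101010101ULL;
    const uint64_t highs   = 0x8080808080808080ULL;
    const uint64_t pattern = ones * (unsigned char)needle; /* needle in every byte */
    size_t i = 0;

    /* Leading case: byte loop until the pointer is 8-byte aligned. */
    while (i < n && ((uintptr_t)(s + i) & 7) != 0) {
        if (s[i] == needle)
            return s + i;
        i++;
    }

    /* Main loop: test 8 bytes per iteration. */
    while (n - i >= 8) {
        uint64_t w;
        memcpy(&w, s + i, sizeof w);   /* portable way to load a word */
        uint64_t x = w ^ pattern;      /* a matching byte becomes 0x00 */
        if ((x - ones) & ~x & highs)   /* nonzero iff some byte of x is zero */
            break;                     /* match is somewhere in this word */
        i += 8;
    }

    /* Trailing case: finish (or pinpoint the match) byte by byte. */
    for (; i < n; i++)
        if (s[i] == needle)
            return s + i;
    return NULL;
}
```

For a haystack of a dozen bytes, the second function may never even
reach its main loop: all the work happens in the leading and trailing
byte loops, plus the setup and extra branches.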

I suspect many of the C library functions of this sort (incl. memcpy +
friends) have a tendency to suffer from this sort of premature
optimization.

Not to mention that overly-specialized benchmarks of this sort often
fail to account for bias caused by the CPU's branch predictor learning
the benchmark and the cache hierarchy amortizing the cost of repeatedly
searching the same haystack -- things you rarely do in real-life
applications.  There's a big risk of your "super-optimized" algorithm
ending up optimizing for an unrealistic use-case, but having only
mediocre or sometimes even poor performance in real-world computations.
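
One way to reduce (not eliminate) that bias is to have each search hit
a different slice of a large buffer with a different needle, instead of
hammering the same haystack in a loop. A minimal sketch, assuming a
search function with an explicit length; every name here (search_fn,
scan_bytes, lcg_next, fill_pool, bench_varied) is hypothetical, and
clock() is a coarse timer, so treat the numbers as rough at best:

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef const char *(*search_fn)(const char *s, size_t n, char needle);

/* The function under test; here just the plain byte loop. */
static const char *scan_bytes(const char *s, size_t n, char needle)
{
    for (size_t i = 0; i < n; i++)
        if (s[i] == needle)
            return s + i;
    return NULL;
}

/* Tiny linear congruential generator so runs are reproducible. */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 8;   /* drop the weak low bits */
}

/* Fill the pool with pseudo-random lowercase letters. */
static void fill_pool(char *pool, size_t len, uint32_t seed)
{
    for (size_t i = 0; i < len; i++)
        pool[i] = (char)('a' + lcg_next(&seed) % 26);
}

/* Run `searches` lookups, each on a different slice of the pool with a
 * different needle. Requires 1 <= max_hay <= pool_len. Returns a
 * checksum of the hit offsets (so the work cannot be optimized away)
 * and stores the elapsed CPU time in *secs. */
static uint64_t bench_varied(search_fn fn, const char *pool, size_t pool_len,
                             size_t max_hay, size_t searches, uint32_t seed,
                             double *secs)
{
    uint64_t checksum = 0;
    clock_t t0 = clock();
    for (size_t k = 0; k < searches; k++) {
        size_t len    = 1 + lcg_next(&seed) % max_hay;
        size_t start  = lcg_next(&seed) % (pool_len - len + 1);
        char   needle = (char)('a' + lcg_next(&seed) % 26);
        const char *hit = fn(pool + start, len, needle);
        checksum += hit ? (uint64_t)(hit - (pool + start)) : len;
    }
    *secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
    return checksum;
}
```

For the cache side of the story to matter, the pool should be larger
than the last-level cache; with a small pool you are back to measuring
a warm-cache best case.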


> Linus Torvalds had once a rant on that subject on realworldtech.
> https://www.realworldtech.com/forum/?threadid=168200&curpostid=168589

Nice.


T

-- 
If the comments and the code disagree, it's likely that *both* are wrong. -- Christopher

