Dynamic array comparison optimization: check for equal pointers?

Johan j at j.nl
Wed Jun 3 23:19:34 UTC 2020


On Wednesday, 3 June 2020 at 13:46:00 UTC, Andrei Alexandrescu 
wrote:
> On 6/3/20 1:48 AM, Patrick Schluter wrote:
>> On Tuesday, 2 June 2020 at 19:38:04 UTC, Johan wrote:
>>>
>>> I am surprised that memcmp itself does not implement the 
>>> pointers are equal check. Is there a big impact of adding the 
>>> shortcircuit check to prevent scanning memory?
>> 
>> memcmp() doesn't implement the check to avoid memcmp(NULL, 
>> NULL, length) returning with 0 instead of segfaulting as it 
>> does if any of the 2 pointers in NULL.
>> It's a case of consistant behaviour, either you tolerate NULL 
>> pointers in all cases or you segfault in all cases.
>
> Affirmative:
>
> https://code.woboq.org/userspace/glibc/string/memcmp.c.html#301
>
> My take on the original matter is - adding an 
> easily-predictable branch is unlikely to affect speed in the 
> most naive speculative execution hardware. However, prediction 
> does take resources that would be otherwise available for other 
> branches, and the branch adds to code size. Both of these are 
> important in large applications. So inserting a test will look 
> good in microbenchmarks but may be actually a negative in 
> real-world use.

It's such a simple optimization idea that I think it must have 
been tested by many many people already; presumably the outcome 
was that the gain is not worth the cost.
In case the data is _not_ equal, I guess the inequality actually 
shows up quite early so only a fraction of memory is scanned.
In case the data _is_ equal, that's where pointer comparison can 
cut a significant amount of memory read, but perhaps in most 
cases the data being equal is still stored in different memory 
locations.
Again, I'm very interested to see a perf comparison in a real 
application. Perhaps I can test this at Weka.

-Johan



More information about the Digitalmars-d mailing list