Fast linear search in D array (slice)

Stanislav Blinov stanislav.blinov at gmail.com
Thu Nov 8 01:47:05 UTC 2018


On Wednesday, 7 November 2018 at 21:06:14 UTC, Per Nordlöw wrote:

> - Andrei has mentioned that sentinel-based search can be faster 
> if the haystack is allowed to be temporarily mutated at the 
> end. But I presume such a solution is only faster above a 
> certain length for the haystack. Has anybody benchmarked and 
> found a suitable length limit for this?

Speaking about x86(64), when your haystack fits in the cache 
entirely and is already in it, you'd be hard pressed to beat 
linear search. If not, a naive sentinel search is more likely to 
*degrade* performance, unless needle tends to appear in the last 
cache line worth of data. That's because you'll be performing 
unnecessary memory accesses. A sentinel search by cache line may 
be a better idea.
So there isn't a particular length limit. When performance is 
key, your program has to decide which approach to take based on 
your data layout and access patterns. A single generic library 
function can't do that.

> - As a special-case of the above there's also the possibility 
> of mutating the last character to be a null and then reusing 
> `strchr`.

D libraries should reduce dependencies on C libraries, not breed 
them.


More information about the Digitalmars-d mailing list