Fast linear search in D array (slice)
Per Nordlöw
per.nordlow at gmail.com
Wed Nov 7 21:06:14 UTC 2018
I'm currently implementing overloads of some Phobos's search
algorithms where the haystack is a `char[]` and the needle is an
ASCII-clean `char`. I plan to integrate them to Phobos later down
the road.
One such overload is at
https://github.com/nordlow/phobos-next/blob/master/src/find_split_ex.d#L10
which currently implements the linear search as
foreach (immutable offset; 0 .. haystack.length)
{
const bool hit = haystack[offset] == needles[0];
if (hit)
{
return Result(haystack, offset);
}
}
I now have a couple questions regarding the performance of the
above foreach-loop:
- Is this the preferred way of implementing a linear search in D?
- Andrei has mentioned that sentinel-based search can be faster
if the haystack is allowed to be temporarily mutated at the end.
But I presume such a solution is only faster above a certain
length for the haystack. Has anybody benchmarked and found a
suitable length limit for this?
- As a special-case of the above there's also the possibility of
mutating the last character to be a null and then reusing
`strchr`.
- When is it preferred to search in blocks of 2, 4, 8 or 16
bytes? I presume for a certain alignment and, again, length limit.
- Would it be relevant to make such a function a compiler
intrinsic that handles all these cases for the generic
non-decoding case where haystack is a `T[]` and needle a `T`?
More information about the Digitalmars-d
mailing list