Fast linear search in D array (slice)

Per Nordlöw per.nordlow at gmail.com
Wed Nov 7 21:06:14 UTC 2018


I'm currently implementing overloads of some Phobos's search 
algorithms where the haystack is a `char[]` and the needle is an 
ASCII-clean `char`. I plan to integrate them to Phobos later down 
the road.

One such overload is at

https://github.com/nordlow/phobos-next/blob/master/src/find_split_ex.d#L10

which currently implements the linear search as

foreach (immutable offset; 0 .. haystack.length)
{
     const bool hit = haystack[offset] == needles[0];
     if (hit)
     {
         return Result(haystack, offset);
     }
}

I now have a couple questions regarding the performance of the 
above foreach-loop:

- Is this the preferred way of implementing a linear search in D?
- Andrei has mentioned that sentinel-based search can be faster 
if the haystack is allowed to be temporarily mutated at the end. 
But I presume such a solution is only faster above a certain 
length for the haystack. Has anybody benchmarked and found a 
suitable length limit for this?
- As a special-case of the above there's also the possibility of 
mutating the last character to be a null and then reusing 
`strchr`.
- When is it preferred to search in blocks of 2, 4, 8 or 16 
bytes? I presume for a certain alignment and, again, length limit.
- Would it be relevant to make such a function a compiler 
intrinsic that handles all these cases for the generic 
non-decoding case where haystack is a `T[]` and needle a `T`?


More information about the Digitalmars-d mailing list