faster splitter
Chris via Digitalmars-d
digitalmars-d at puremagic.com
Sat May 28 07:18:10 PDT 2016
On Saturday, 28 May 2016 at 12:47:59 UTC, Andrei Alexandrescu
wrote:
> On 5/28/16 6:56 AM, qznc wrote:
>> The sentinel value is `needleBack+1`, but range elements need
>> not
>> support addition. Finding a sentinel is hard and most
>> certainly requires
>> more assumptions about the ranges.
>
> No need for a sentinel really so long as you first search for
> the last element of the needle, in the haystack.
>
> Allow me to put on the table a simple brute force routine,
> specialized for arrays with copyable elements, no custom
> predicate etc. Far as I can tell it does no redundant work:
>
> T[] find(T)(T[] haystack, T[] needle)
> {
> if (needle.length == 0) return haystack;
> immutable lastIndex = needle.length - 1;
> auto last = needle[lastIndex];
> size_t j = lastIndex;
> for (; j < haystack.length; ++j)
> {
> if (haystack[j] != last) continue;
> immutable k = j - lastIndex;
> // last elements match
> for (size_t i = 0; ; ++i)
> {
> if (i == lastIndex) return haystack[k .. $];
> if (needle[i] != haystack[k + i]) break;
> }
> }
> return haystack[$ .. $];
> }
>
> unittest
> {
> string s1 = "hello abc world";
> assert(find(s1, "abc") == "abc world");
> assert(find(s1, "def") == "");
> }
>
> void main(){}
>
>
> Andrei
I included your function (I also fixed the bug in my
findStringS_). Here's a typical result:
std find: 208 ±61
manual find: 127 ±29
qznc find: 108 ±13
Chris find: 140 ±32
Andrei find: 126 ±27
=====
std find: 207 ±59
manual find: 128 ±30
qznc find: 108 ±13
Chris find: 141 ±32
Andrei find: 125 ±27
I used dmd, because I don't have ldc on my laptop. qznc's find is
clearly the winner.
More information about the Digitalmars-d
mailing list