faster splitter

Tue May 24 03:44:12 PDT 2016

On Tuesday, 24 May 2016 at 09:34:30 UTC, Chris wrote:
> On Tuesday, 24 May 2016 at 07:54:38 UTC, qznc wrote:
>> Apart from advanced algorithms, find should not be slower than 
>> a naive nested-for-loop implementation.
>>
>> [0] https://issues.dlang.org/show_bug.cgi?id=16066
>
> From Phobos [1]:
>
>         /* Preprocess #2: init skip[] */
>         /* Note: This step could be made a lot faster.
>         * A simple implementation is shown here. */
>         this.skip = new size_t[needle.length];
>         foreach (a; 0 .. needle.length)
>         {
>             size_t value = 0;
>             while (value < needle.length
>                    && !needlematch(needle, a, value))
>             {
>                 ++value;
>             }
>             this.skip[needle.length - a - 1] = value;
>         }
>
> [1] 
> https://github.com/dlang/phobos/blob/master/std/algorithm/searching.d#L335

Unfortunately, it is slower. My current measurements [0]:

$ ./benchmark 10000000 10 # large haystack, few iterations
std find    took    753837231
manual find took    129561980
$ ./benchmark 10 10000000 # small haystack, many iterations
std find    took    721676721
manual find took    216418870

The nested-for-loop implementation is roughly 4 times faster!

Caveat: I'm not completely sure my benchmark is fair yet.

[0] https://github.com/qznc/find-is-too-slow