faster splitter
Andrei Alexandrescu via Digitalmars-d
digitalmars-d at puremagic.com
Tue May 24 05:38:04 PDT 2016
On 05/24/2016 06:44 AM, qznc wrote:
> On Tuesday, 24 May 2016 at 09:34:30 UTC, Chris wrote:
>> On Tuesday, 24 May 2016 at 07:54:38 UTC, qznc wrote:
>>> Apart from advanced algorithms, find should not be slower than a
>>> naive nested-for-loop implementation.
>>>
>>> [0] https://issues.dlang.org/show_bug.cgi?id=16066
>>
>> From Phobos [1]:
>>
>> /* Preprocess #2: init skip[] */
>> /* Note: This step could be made a lot faster.
>> * A simple implementation is shown here. */
>> this.skip = new size_t[needle.length];
>> foreach (a; 0 .. needle.length)
>> {
>> size_t value = 0;
>> while (value < needle.length
>> && !needlematch(needle, a, value))
>> {
>> ++value;
>> }
>> this.skip[needle.length - a - 1] = value;
>> }
>>
>> [1]
>> https://github.com/dlang/phobos/blob/master/std/algorithm/searching.d#L335
>>
>
> Unfortunately, it is slower. My current measurements [0]:
>
> $ ./benchmark 10000000 10 # large haystack, few iterations
> std find took 753837231
> manual find took 129561980
> $ ./benchmark 10 10000000 # small haystack, many iterations
> std find took 721676721
> manual find took 216418870
>
> The nested-for-loop implementation is roughly 4 times faster!
>
> Caveat: I'm not completely sure my benchmark is fair yet.
>
> [0] https://github.com/qznc/find-is-too-slow
One somewhat unrelated thing that can be done with Boyer-Moore: use a
constant-length skip array instead of dynamic allocation. Upon
saturation, continue with brute force. -- Andrei
More information about the Digitalmars-d
mailing list