faster splitter
Andrei Alexandrescu via Digitalmars-d
digitalmars-d at puremagic.com
Mon May 30 11:20:39 PDT 2016
On 05/30/2016 05:31 AM, qznc wrote:
> On Sunday, 29 May 2016 at 21:07:21 UTC, qznc wrote:
>> When looking at the assembly I don't like the single-byte loads. Since
>> string (ubyte[] here) is of extraordinary importance, it should be
>> worthwhile to use word loads [0] instead. Really fancy would be SSE.
>
> So far, the results look disappointing. Andrei find does not get faster
> with wordwise matching:
>
> ./benchmark.ldc
> std find: 133 ±25 +38 (3384) -19 (6486)
> manual find: 140 ±37 +64 (2953) -25 (6962)
> qznc find: 114 ±17 +33 (2610) -11 (7262)
> Chris find: 146 ±39 +66 (3045) -28 (6873)
> Andrei find: 126 ±29 +54 (2720) -19 (7189)
> Wordwise find: 130 ±30 +53 (2934) -21 (6980)
>
> Interesting side-note: On my laptop Andrei find is faster than qznc find
> (for LDC), but on my desktop it reverses (see above). Both are Intel i7.
> Need to find a simpler processor. Maybe wordwise is faster there.
> Alternatively, find is purely memory bound and the L1 cache makes every
> difference disappear.
>
> Also, note how std find is faster than manual find! Finding a reliable
> benchmark is hard. :/
Please throw this hat into the ring as well: it should dramatically
improve average search over a large vocabulary.
https://dpaste.dzfl.pl/dc8dc6e1eb53
It uses a BM-inspired trick - once the last character has matched, if the
full match subsequently fails, the search needn't resume at the very next
character in the haystack. The "skip" is computed lazily, and in a
separate function, so as to keep the main loop tight. All in all a
routine worth a look. I wanted to write this for a long time. -- Andrei
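For anyone who can't reach the paste, here is a minimal sketch of the
idea as I read it (names and details are mine, not the actual dpaste
code): compare the needle's last character first; when that matches but
the full comparison fails, advance by a skip derived from the previous
occurrence of that last character in the needle. The skip is computed
lazily, on the first failed full match, in a separate function so the
hot loop stays small.

```d
// Hypothetical sketch of the BM-inspired lazy-skip find; not Andrei's
// actual code from the paste.
size_t bmFind(const(ubyte)[] haystack, const(ubyte)[] needle)
{
    if (needle.length == 0) return 0;
    if (needle.length > haystack.length) return haystack.length;

    immutable lastIdx = needle.length - 1;
    immutable last = needle[lastIdx];
    size_t skip = 0; // 0 means "not computed yet"; real skips are >= 1

    for (size_t i = lastIdx; i < haystack.length; )
    {
        if (haystack[i] != last) { ++i; continue; }
        // Last character matches; compare the rest of the needle.
        immutable j = i - lastIdx;
        if (haystack[j .. i] == needle[0 .. lastIdx])
            return j; // index of the match
        if (skip == 0) skip = computeSkip(needle); // lazy, out of the loop
        i += skip; // no match can end closer than this
    }
    return haystack.length; // not found, Phobos-style sentinel
}

// Kept separate so the search loop stays tight: distance from the end
// of the needle back to the previous occurrence of its last character,
// or the whole needle length if that character is unique in the needle.
size_t computeSkip(const(ubyte)[] needle)
{
    immutable last = needle[$ - 1];
    foreach_reverse (k; 0 .. needle.length - 1)
        if (needle[k] == last)
            return needle.length - 1 - k;
    return needle.length;
}
```

The skip is safe because any shorter shift would align the already-matched
haystack byte with a needle position that is known not to hold the last
character.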