[Issue 12690] std.regex BacktrackingMatcher bmatch is faster than ThompsonMatcher but discouraged

via Digitalmars-d-bugs digitalmars-d-bugs at puremagic.com
Fri May 2 13:51:19 PDT 2014


https://issues.dlang.org/show_bug.cgi?id=12690

Dmitry Olshansky <dmitry.olsh at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmitry.olsh at gmail.com

--- Comment #1 from Dmitry Olshansky <dmitry.olsh at gmail.com> ---
For the test case in question on Win32 I get:

//With bmatch
D:\D>dvt4a
Time=3 secs, 802 ms, 707 μs, and 7 hnsecs, matches per second=1577823

//With match
D:\D>dvt4b
Time=4 secs, 274 ms, 43 μs, and 7 hnsecs, matches per second=1403822

Which is around 10%. Note that proper flags for optimized build are:

dmd -O -release -inline


On ldc proper flags (IIRC) are:

ldc -O3 -release

--- 

I would not even consider result w/o inlining as showing anything. Please
compile your real-world program with the above flags.

---

On the test and results of such.

The provided test measures the overhead of starting an engine (memory
allocation/deallocation eats ~25% of total time in both cases). Overall indeed
BT has a bit less of work to do at start-up.

Generally it is known that BT matchers do have an advantage on patterns that
tend to match, the problem with them is low stability (small change in pattern
may change performance radically) and bad performance on ambiguous patterns. 

I imagine the problem you are solving involves a number of patterns that is
close to the number of haystacks. Even then a 10% slower start-up is no more
then ~1.21% of total time.

P.S. There is an enhancement that may be of interest:
https://issues.dlang.org/show_bug.cgi?id=12227

--


More information about the Digitalmars-d-bugs mailing list