Bad performance of simple regular expression - why??
Benji Smith
dlanguage at benjismith.net
Fri Feb 9 10:11:28 PST 2007
Jascha Wetzel wrote:
> Manfred Nowak wrote:
>> Also for maximal munching several regular expression there is no need
>> for backtracking.
>
> maximal munch, non-greedy matching and negative matches would usually
> use backtracking. doing it without backtracking would require to use
> several DFAs - or is there a simpler method?
>
> afaik, the "school-book-way" for backtracking are deterministic pushdown
> automatons (PDAs) that can also be interpreted efficiently or compiled
> to bytecode.
> so i don't see why there shouldn't be backtracking in the regexp
> implementation.
The problem is that most regex engines (in fact, all of them, as far as
I know) use backtracking even when a DFA implementation could be used to
determine (much earlier in the search process) that a match is impossible.
A backtracking approach might be necessary with positive and negative
lookaround assertions (though I still suspect that a DFA could even be
used for these cases), but every mainstream regex engine uses a
backtracking approach even for simple choice situations like (cat|car)
where a DFA would be much more appropriate.
And, by the way, it's possible to create a DFA implementation that
degrades into a backtracking matcher for certain subexpressions.
--benji
More information about the Digitalmars-d
mailing list