regex direct support for sse4 intrinsics
Tove
tove at fransson.se
Tue Mar 27 14:30:45 PDT 2012
On Tuesday, 27 March 2012 at 09:51:07 UTC, bearophile wrote:
> Dmitry Olshansky:
>
>> Speaking more of run-time version of regex, it is essentially
>> running a VM that executes instructions that do various kinds
>> of match-this, match-that. The VM dispatch code is quite slow,
>> the optimal _threaded_ code requires either doing it in
>> _assembly_ or _computed_ goto in the language. The VM
>> _dispatch_ takes up to 30% of time in the default matcher.
>
> I have used computed gotos in GCC-C to implement some quite
> efficient finite state machines to be used in computational
> biology. I've seen 20%+ speedups compared to my alternative
> switch-based implementation. So I'd like computed gotos in D
> too.
>
While I am in favor of all enhancements which improve low-level
access, I'm very surprised by your findings by computed gotos...
the compiler I am most used to(rvct for arm)... seems proficient
in emitting jump table instructions(TBB, TBH) for thumb2... but
based on your findings I will definitely re-check the generated
asm. Could it be that the compiler "heuristics" simply is less
than optimal... and an alternative would be to force a specific
implementation with a pragma? or the recent @annotation syntax...
pragma(switch, "jumptable")
pragma(switch, "binary-search-tree")
it would have the benefit of not having to re-factor the code and
one could easily benchmark which solution is the fastest for a
different inputs...
More information about the Digitalmars-d
mailing list