Regex matching cause lots of _d_arrayliteralTX calls
H. S. Teoh
hsteoh at quickfur.ath.cx
Thu Sep 26 17:08:53 PDT 2013
On Fri, Sep 27, 2013 at 01:51:51AM +0200, JR wrote:
> On Thursday, 26 September 2013 at 23:04:22 UTC, bearophile wrote:
> >I am not sure how a IRC bot could consume more than a tiny
> >fraction of the CPU time of a modern multi-GHz processor.
>
> Nor does it bite into my 8 gigabytes of ram.
>
> Forgive me, but the main culprit in all of this is still me doing it
> wrong. Can I keep the same RegexMatcher (perhaps as a struct member)
> and reuse it between matchings?
Not sure what you mean, but are you compiling regexes every time you use
them? If so, you should be storing them instead, for example:
    // Place to store precompiled regex matchers.
    struct MyContext {
        Regex!char pattern1;
        Regex!char pattern2;
        Regex!char pattern3;
        ...
    }

    // Presumably, this will run far more frequently than
    // updatePatterns.
    auto runMatches(MyContext ctxt, string message) {
        if (message.match(ctxt.pattern1)) {
            ...
        } else if (message.match(ctxt.pattern2)) {
            ...
        }
        ...
    }

    // Presumably, this only runs once in a while, so you save on
    // the cost of compiling/storing the regex every single time you
    // run a match.
    void updatePatterns(ref MyContext ctxt,
                        string newPattern1,
                        string newPattern2,
                        string newPattern3, ...)
    {
        ctxt.pattern1 = regex(newPattern1);
        ctxt.pattern2 = regex(newPattern2);
        ctxt.pattern3 = regex(newPattern3);
        ...
    }
So when you need to update your regexes, say based on reloading a config
file or something, you'd run updatePatterns() to compile all the
patterns, then runMatches() can be used during the normal course of your
program. This should save on a lot of overhead.
Of course, if you have regexes that are fixed at compile-time, you could
use ctRegex to *really* speed things up. Or, if that makes your
compilation too slow (it does have a tendency of doing that), initialize
your patterns in a static this() block:
    Regex!char predeterminedPattern1;
    Regex!char predeterminedPattern2;

    // Runs once at program startup, before main().
    static this() {
        predeterminedPattern1 = regex(`...`);
        predeterminedPattern2 = regex(`...`);
    }
    ...
    void matchStuff(string message) {
        if (message.match(predeterminedPattern1)) {
            ...
        }
        ...
    }
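For the ctRegex route mentioned above, here is a minimal sketch. The pattern and the isBotCommand name are made up for illustration (they are not from your bot), and I'm using matchFirst, which current Phobos provides alongside match:

```d
import std.regex;
import std.stdio;

// Hypothetical helper: true if the message starts with a "!command".
bool isBotCommand(string message) {
    // ctRegex builds the matcher at compile time, so no pattern
    // parsing or compilation happens when this function runs.
    auto commandPattern = ctRegex!(`^!(\w+)`);
    return !message.matchFirst(commandPattern).empty;
}

void main() {
    writeln(isBotCommand("!help"));  // prints "true"
    writeln(isBotCommand("hello"));  // prints "false"
}
```

The tradeoff is the one already noted: each ctRegex instantiation adds to compile time, so it is probably only worth it for patterns on the hot path.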
> >And I am not sure if regular expressions are a good idea to
> >implement a IRC interface.
>
> I dare say I disagree!
Yeah, anything involving heavy string processing is probably best done
using regexes rather than ad hoc string slicing, which is bug-prone and
hard to maintain.
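To make that concrete, here is a rough sketch of pulling the fields out of a PRIVMSG line with one precompiled regex. The line layout is a simplified assumption on my part (real IRC lines have more variants), and Privmsg, privmsgPattern, and parsePrivmsg are hypothetical names:

```d
import std.regex;
import std.stdio;

// Hypothetical container for the fields we care about.
struct Privmsg {
    string nick;
    string target;
    string text;
}

// Compiled once at startup and reused for every incoming line.
Regex!char privmsgPattern;

static this() {
    // Assumed, simplified shape: ":nick!user@host PRIVMSG target :text"
    privmsgPattern = regex(`^:([^!]+)!([^@]+)@(\S+) PRIVMSG (\S+) :(.*)$`);
}

// Returns true and fills `result` if `line` looks like a PRIVMSG.
bool parsePrivmsg(string line, out Privmsg result) {
    auto m = line.matchFirst(privmsgPattern);
    if (m.empty)
        return false;
    result.nick = m[1];
    result.target = m[4];
    result.text = m[5];
    return true;
}

void main() {
    Privmsg msg;
    if (parsePrivmsg(":jr!~jr@example.org PRIVMSG #d :hello there", msg))
        writeln(msg.nick, " -> ", msg.target, ": ", msg.text);
}
```

Doing the same with indexOf and slices means tracking several offsets by hand, which is where the off-by-one bugs tend to creep in.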
T
--
First Rule of History: History doesn't repeat itself -- historians merely repeat each other.