Why is std.regex slow, well here is one reason!

Richard (Rikki) Andrew Cattermole richard at cattermole.co.nz
Fri Feb 24 13:07:44 UTC 2023


Okay I have found something that can be improved very easily!

std.regex.internal.parser:

Inside of Parser struct:

```d
     @trusted void error(string msg)
     {
         import std.array : appender;
         // Build the message with plain appends; formattedWrite is no longer needed.
         auto app = appender!string();

         app ~= msg;
         app ~= "\nPattern with error: `";
         app ~= origin[0..$-pat.length];
         app ~= "` <--HERE-- `";
         app ~= pat;
         app ~= "`";

         throw new RegexException(app.data);
     }
```

That'll cut out ~100ms by removing formattedWrite!
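
To check that the plain appends produce the same message, here is a minimal sketch. The format string in `viaFormat` is my assumption of what a formattedWrite-based version would look like, and both function names are hypothetical free functions standing in for the Parser method:

```d
import std.array : appender;
import std.format.write : formattedWrite;

// Assumed shape of a formattedWrite-based version (hypothetical helper).
string viaFormat(string msg, string origin, string pat)
{
    auto app = appender!string();
    app.formattedWrite("%s\nPattern with error: `%s` <--HERE-- `%s`",
        msg, origin[0 .. $ - pat.length], pat);
    return app.data;
}

// Same message built with plain appends, as in error() above.
string viaAppend(string msg, string origin, string pat)
{
    auto app = appender!string();
    app ~= msg;
    app ~= "\nPattern with error: `";
    app ~= origin[0 .. $ - pat.length];
    app ~= "` <--HERE-- `";
    app ~= pat;
    app ~= "`";
    return app.data;
}

unittest
{
    // pat is the unparsed tail of origin, so the prefix is the parsed part.
    assert(viaFormat("bad", "a(b", "(b") == viaAppend("bad", "a(b", "(b"));
}
```

The appends avoid instantiating the formattedWrite template machinery, which is where I'd expect the saving to come from.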


Oooo, ``static immutable CharMatcher matcher = 
CharMatcher(wordCharacter);`` is causing 541ms of slowness. That 
line isn't showing up in the profile, so the profiler could be improved: 
it also needs to cover CTFE initialization of, say, globals.

Next big jump for the above:

```d
@property auto wordMatcher()()
{
     return CharMatcher(unicode.Alphabetic | unicode.Mn | unicode.Mc
         | unicode.Me | unicode.Nd | unicode.Pc);
}
```

Add some pure annotations to CharMatcher and BitTable constructors.

These two things take out a good 700ms!
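
A minimal sketch of the eager vs. lazy difference, assuming a made-up `Table` struct as a stand-in for CharMatcher (none of these names are from Phobos):

```d
// Hypothetical stand-in for an expensive-to-build lookup table.
struct Table
{
    bool[256] bits;

    // pure constructor, in the spirit of the annotations mentioned above.
    this(string chars) pure nothrow @nogc @safe
    {
        foreach (c; chars)
            bits[cast(ubyte) c] = true;
    }

    bool opIndex(char c) const pure nothrow @nogc @safe
    {
        return bits[cast(ubyte) c];
    }
}

// Eager: the initializer has to be evaluated at compile time (CTFE).
static immutable Table eagerTable = Table("abc_");

// Lazy: a templated property, only instantiated if something calls it,
// and the table is then built at run time.
@property auto lazyTable()()
{
    return Table("abc_");
}

unittest
{
    auto t = lazyTable();
    assert(eagerTable['a'] && !eagerTable['z']);
    assert(t['_'] && !t['z']);
}
```

The trade-off is that the lazy version rebuilds the table on every call, so whether it is a net win at run time depends on how often it is called.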

Looks like constructors are not showing up at all. KickStart from 
std.regex.internal.kickstart is not showing up for postprocess from 
std.regex.internal.parser. Not that we can do anything there just by 
removing stuff (a call to text shows up, but removing it doesn't help much).

Okay, looks like I'm at the 62ms mark. There are certainly more things to 
do, but individually they're starting to get into premature-optimization 
territory. I'll do a PR for the above sets of changes.

