D 1.0: std.regexp incredibly slow!

Steven Schveighoffer schveiguy at yahoo.com
Tue Sep 22 10:28:07 PDT 2009


On Tue, 22 Sep 2009 11:55:53 -0400, Markus Dangl <danglm at in.tum.de> wrote:

> Hi *,
>
> I stumbled on what seems to be a bug in std.regexp: it is incredibly
> slow with the following pattern:
> RegExp("^\\s+(\\d+)\\s+(\\d+)\\s+\\w+\\s+(\\w+)\\s+\\S+\\s+\\S+\\s+\\S+\\s+\\S+\\s+\\S+\\s+(.*)\r?\n?$")
>
> I don't really understand the regexp code, so I can't debug it myself,
> but I have a PHP (!!) script that executes the same regexp in milliseconds.
>
> I attached code to test it. Can someone please confirm?
>
> Thanks,
> Markus
>
> PS: Is there a quick way to fix this, or are there bindings for other
> RegExp libs that I can use (Linux and Windows required)? I need to fix
> my program soon :) At the moment I'm looking for workarounds (splitting
> it into small regexps).

This is a common problem with some regex designs. Java has (or had) the
same problem. I don't know if it's fixable; you may want to try Tango's
regex package.
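
If the input really is whitespace-separated columns, as the pattern above
suggests, one possible workaround is to skip the regex and split each line
by hand in a single linear pass. The following is only a rough sketch and
rests on several assumptions: it is written against current D2 Phobos
(std.ascii, std.stdio) rather than the D 1.0 library discussed here, the
names splitLine and allDigits are hypothetical helpers and not part of any
library, and the sample line in main is made up. It takes nine tokens
followed by the rest of the line, and checks only that the first two tokens
are digits; the \w+ checks for the third and fourth tokens are left out.

```d
import std.ascii : isDigit, isWhite;
import std.stdio : writeln;

/// True if `s` is non-empty and consists only of ASCII digits (mirrors \d+).
bool allDigits(string s)
{
    if (s.length == 0)
        return false;
    foreach (char c; s)
        if (!isDigit(c))
            return false;
    return true;
}

/// Hypothetical helper, not part of Phobos. Splits `line` into `nTokens`
/// whitespace-delimited tokens plus the remaining text, in one linear pass.
bool splitLine(string line, size_t nTokens, out string[] tokens, out string rest)
{
    // Drop an optional trailing "\n" and "\r", like \r?\n?$ in the pattern.
    if (line.length > 0 && line[$ - 1] == '\n') line = line[0 .. $ - 1];
    if (line.length > 0 && line[$ - 1] == '\r') line = line[0 .. $ - 1];

    // The pattern begins with ^\s+, so leading whitespace is required.
    if (line.length == 0 || !isWhite(line[0]))
        return false;

    size_t i = 0;
    foreach (n; 0 .. nTokens)
    {
        while (i < line.length && isWhite(line[i])) ++i;  // skip \s+
        immutable start = i;
        while (i < line.length && !isWhite(line[i])) ++i; // consume \S+
        if (start == i)
            return false; // fewer tokens than expected
        tokens ~= line[start .. i];
    }

    // At least one whitespace character must precede the trailing text.
    if (i == line.length)
        return false;
    while (i < line.length && isWhite(line[i])) ++i;
    rest = line[i .. $];
    return true;
}

void main()
{
    // Made-up input line, for illustration only.
    string[] tokens;
    string rest;
    if (splitLine("  12 34 ab cd e f g h i trailing text\r\n", 9, tokens, rest)
        && allDigits(tokens[0]) && allDigits(tokens[1]))
    {
        // Tokens 0, 1 and 3 correspond to capture groups 1 to 3; `rest` to group 4.
        writeln(tokens[0], " ", tokens[1], " ", tokens[3], " | ", rest);
    }
}
```

Each character is visited once, so the running time does not depend on how
a regex engine handles the pattern. The trade-off is that the line format is
now hard-coded instead of being described by a pattern.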

-Steve


