[Issue 7471] New: Improve performance of std.regex

d-bugmail at puremagic.com
Thu Feb 9 09:28:00 PST 2012


http://d.puremagic.com/issues/show_bug.cgi?id=7471

           Summary: Improve performance of std.regex
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody at puremagic.com
        ReportedBy: Jesse.K.Phillips+D at gmail.com


--- Comment #0 from Jesse Phillips <Jesse.K.Phillips+D at gmail.com> 2012-02-09 09:27:58 PST ---
The previous implementation is said to do some caching of the last used engine.
english.dic has 134,950 entries for these timings.

Test code
----------
import std.file;
import std.string;
import std.datetime;
import std.regex;

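// Word-frequency table built from the dictionary file.
private int[string] model;

void main() {
   auto name = "english.dic";

   // Count each lower-cased line of the dictionary.
   foreach(w; std.file.readText(name).toLower.splitLines)
      model[w] += 1;

   // Build the regex inside the loop on purpose, so that a new engine is
   // requested for every word instead of reusing one constructed up front.
   foreach(w; std.string.split(readText(name)))
      if(!match(w, regex(r"\d")).empty)
      {}
      else if(!match(w, regex(r"\W")).empty)
      {}
}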
-------

I'm trying to avoid the caching here, but still see better performance in
2.056. Note that these timings were taken with mingw on Windows. I find it odd
that user time is low while real time is the slow part; does mingw have access
to the proper information?

$ time ./test2.056.exe

real    0m0.860s
user    0m0.047s
sys     0m0.000s

$ time ./test2.058.exe

real    0m55.500s
user    0m0.031s
sys     0m0.000s
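
Since the user/real split reported by `time` under mingw looks unreliable, the
wall-clock time can also be measured from inside the program. The following is
a minimal sketch, not part of the original test: it assumes a Phobos recent
enough to provide std.datetime.stopwatch (in 2.058 StopWatch lived in
std.datetime itself), the function names countRebuilt and countHoisted are
hypothetical, and the short word list is a stand-in for english.dic. It times
the per-iteration regex construction used above against building each regex
once; whether the two differ will depend on the Phobos version and its caching.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.regex : match, regex;
import std.stdio : writefln;

// Hypothetical helper: builds both patterns on every iteration,
// as the test code in this report does.
size_t countRebuilt(string[] words)
{
    size_t hits;
    foreach (w; words)
    {
        if (!match(w, regex(r"\d")).empty)
            ++hits;
        else if (!match(w, regex(r"\W")).empty)
            ++hits;
    }
    return hits;
}

// Hypothetical helper: builds each pattern once and reuses it.
size_t countHoisted(string[] words)
{
    auto digit = regex(r"\d");
    auto nonWord = regex(r"\W");
    size_t hits;
    foreach (w; words)
    {
        if (!match(w, digit).empty)
            ++hits;
        else if (!match(w, nonWord).empty)
            ++hits;
    }
    return hits;
}

void main()
{
    // Assumption: a tiny stand-in for the english.dic word list.
    auto words = ["apple", "b2b", "can't", "zebra"];

    // Wall-clock time for the rebuild-every-iteration variant.
    auto sw = StopWatch(AutoStart.yes);
    auto a = countRebuilt(words);
    auto rebuilt = sw.peek;

    // Restart the clock and time the build-once variant.
    sw.reset();
    auto b = countHoisted(words);
    auto hoisted = sw.peek;

    writefln("rebuilt: %s hits in %s", a, rebuilt);
    writefln("hoisted: %s hits in %s", b, hoisted);
}
```

Replacing the stand-in list with std.string.split(readText("english.dic"))
would reproduce the workload of the test code above.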
