[Issue 16457] New: std.regex postprocesses at ctRegex every time at runtime

via Digitalmars-d-bugs digitalmars-d-bugs at puremagic.com
Wed Aug 31 16:32:56 PDT 2016


https://issues.dlang.org/show_bug.cgi?id=16457

          Issue ID: 16457
           Summary: std.regex postprocesses at ctRegex every time at
                    runtime
           Product: D
           Version: D2
          Hardware: x86_64
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P1
         Component: phobos
          Assignee: nobody at puremagic.com
          Reporter: greensunny12 at gmail.com

Consider the following program. It crashes because it allocates _a lot_ of
memory before the garbage can be collected.

void main()
{
    import std.regex;
    enum re = ctRegex!(`((c)(s)?)?ti`);
    import core.memory : GC;
    string text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit";

    foreach (i; 0..500_000_000)
    {
        if (auto m = matchAll(text, re)) {}
        //if (i % 1_000_000 == 0)
            //GC.collect();
    }
}

On my machine (16 GB of RAM) it crashes after about 5M iterations.
The GC profile shows two hotspots (here for 2M iterations):

     2048000000            2000000    uint[] D main std/regex/internal/parser.d:1607
      184000000            2000000    std.regex.internal.ir.Bytecode[] D main std/array.d:852

(the latter is insertInPlace)

After looking at the code, this seems pretty weird, because "postprocess"
should be called only once, while the regex is being constructed.
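
A possible workaround sketch, not verified against this issue: in D an `enum`
manifest constant is substituted at every use site, so one assumption is that
the per-iteration allocations come from the regex data being rebuilt each time
`re` is used. Binding the ctRegex to a runtime variable once, outside the loop,
may avoid that. The helper name `countMatches` and the reduced iteration count
are hypothetical and only serve the illustration.

```d
import std.regex : ctRegex, matchAll;
import std.range : walkLength;

// Hypothetical helper (not part of the report): counts the matches of `re`
// in `text` by walking the range returned by matchAll.
size_t countMatches(R)(string text, R re)
{
    return matchAll(text, re).walkLength;
}

void main()
{
    // Bound once to a runtime variable instead of an `enum`, on the
    // assumption that the `enum` is what gets re-materialized per use.
    auto re = ctRegex!(`((c)(s)?)?ti`);
    string text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit";

    size_t total;
    foreach (i; 0 .. 1_000) // far fewer iterations than in the report
        total += countMatches(text, re);
}
```

If the assumption holds, running this variant under the GC profiler should no
longer attribute one `uint[]` allocation per iteration to `D main`.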
