[Issue 16457] std.regex postprocesses ctRegex every time at runtime

via Digitalmars-d-bugs digitalmars-d-bugs at puremagic.com
Sat Sep 3 12:51:57 PDT 2016


https://issues.dlang.org/show_bug.cgi?id=16457

Dmitry Olshansky <dmitry.olsh at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmitry.olsh at gmail.com

--- Comment #1 from Dmitry Olshansky <dmitry.olsh at gmail.com> ---
(In reply to greensunny12 from comment #0)
> Consider the following program, it will crash as it allocates _a lot_ of
> memory before the garbage can be collected. 
> 
> void main()
> {
>     import std.regex;
>     enum re = ctRegex!(`((c)(s)?)?ti`);
>     import core.memory : GC;
>     string text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit";
> 
>     foreach (i; 0..500_000_000)
>     {
>         if (auto m = matchAll(text, re)) {}
>         //if (i % 1_000_000)
>             //GC.collect();
>     }
> }
> 
> On my machine (16G) it crashes at about 5M iterations.
> The GC profile finds two hotspots (here 2M iterations): 
> 
>      2048000000	        2000000	uint[] D main std/regex/internal/parser.d:1607
>       184000000	        2000000	std.regex.internal.ir.Bytecode[] D main std/array.d:852
> 
> (the latter is insertInPlace)
> 
> After looking at the code it seems pretty weird, because "postprocess"
> should be called while constructing the RegEx and once only.

The problem is the "enum re = ..." line. An enum is a manifest constant, so
ctRegex!`((c)(s)?)?ti` is copy-pasted at each place of usage, which reconstructs
the regex on every loop iteration because the compile-time version can't cache
compiled patterns.

Replace enum with static and all should be good.
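
A minimal sketch of that change, for illustration only. The helper name
countMatchingIterations and the smaller iteration count are assumptions made
here and are not part of the original report; the pattern and text are the
ones from comment #0.

```d
import std.regex;

// Hypothetical helper (not from the report): counts how many loop
// iterations find at least one match of the pattern in `text`.
size_t countMatchingIterations(string text, size_t iterations)
{
    // static: the regex object is stored once and reused by every
    // iteration. With enum, the ctRegex!(...) expression would be
    // pasted in and evaluated again at each use inside the loop.
    static re = ctRegex!(`((c)(s)?)?ti`);

    size_t hits;
    foreach (i; 0 .. iterations)
    {
        if (auto m = matchAll(text, re))
            ++hits;
    }
    return hits;
}

void main()
{
    string text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit";
    // Iteration count reduced from the report's 500_000_000 (assumption).
    countMatchingIterations(text, 1_000_000);
}
```

The pattern is still compiled at compile time; only the storage of the
resulting object changes, so the loop body no longer rebuilds it.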
