[Issue 21716] New: std.regex performance regression (additional GC allocation)

d-bugmail at puremagic.com d-bugmail at puremagic.com
Mon Mar 15 02:34:23 UTC 2021


https://issues.dlang.org/show_bug.cgi?id=21716

          Issue ID: 21716
           Summary: std.regex performance regression (additional GC
                    allocation)
           Product: D
           Version: D2
          Hardware: x86
                OS: Mac OS X
            Status: NEW
          Severity: enhancement
          Priority: P1
         Component: phobos
          Assignee: nobody at puremagic.com
          Reporter: jrdemail2000-dlang at yahoo.com

I have observed a regex-related performance regression in my tsv-utils package
when switching from LDC 1.24.0 to LDC 1.25.0. This corresponds to a performance
regression in DMD 2.095.

The most likely cause is Phobos PR #7678:
https://github.com/dlang/phobos/pull/7678.

A possible cause was identified by Petar Kirov in the PR comments
(https://github.com/dlang/phobos/pull/7678#issuecomment-787814712):

> At first glance, the suspect could be the two new delegates
> defaultFactoryImpl and matchOnceImpl. They may or may not
> cause GC allocations. If they are replaced with static nested
> functions, they would be guaranteed not to cause GC allocations.
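
To illustrate the distinction drawn in that comment, here is a minimal sketch.
It is not the std.regex code: makeAdder, addOne and impl are hypothetical
names chosen for this example. It only shows why a delegate that captures a
local may need a GC-allocated closure, while a static nested function cannot.

```d
// Hypothetical example, not taken from Phobos.

// Returns a delegate that captures the parameter `n`. Because the delegate
// outlives this call, the compiler has to keep `n` alive in a closure, which
// is allocated on the GC heap. This function therefore cannot be @nogc.
int delegate(int) makeAdder(int n)
{
    return (int x) => x + n;
}

// A static nested function has no access to the enclosing frame, so there
// is nothing to capture and no closure can be allocated. The @nogc
// attributes make the compiler enforce that.
int addOne(int x) @nogc
{
    static int impl(int y) @nogc
    {
        return y + 1;
    }
    return impl(x);
}

void main()
{
    auto add2 = makeAdder(2);
    assert(add2(3) == 5);
    assert(addOne(1) == 2);
}
```

Whether the delegates in the PR actually allocate depends on how they are
used, which is why the comment says "may or may not".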

Checking GC stats using "--DRT-gcopt=profile:1" shows increased GC activity in
the program:

With DMD 2.094.2:

> 	Number of collections:  0
> 	Total GC prep time:  0 milliseconds
> 	Total mark time:  0 milliseconds
> 	Total sweep time:  0 milliseconds
> 	Max Pause Time:  0 milliseconds
> 	Grand total GC time:  0 milliseconds
> GC summary:    5 MB,    0 GC    0 ms, Pauses    0 ms <    0 ms

With DMD 2.095.1:

> 	Number of collections:  672
> 	Total GC prep time:  7 milliseconds
> 	Total mark time:  62 milliseconds
> 	Total sweep time:  7 milliseconds
> 	Max Pause Time:  1 milliseconds
> 	Grand total GC time:  76 milliseconds
> GC summary:    5 MB,  672 GC   76 ms, Pauses   69 ms <    1 ms

The regular expression used in the tests is '[RD].*ION[0-2]', run against 14
million lines (one invocation per line). It's one of my standard benchmarks.
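
The shape of that test can be sketched roughly as follows. This is not the
tsv-utils benchmark: countMatches and the three sample lines are hypothetical
stand-ins for the real 14 million line input, and the sketch assumes a
druntime recent enough to provide core.memory.GC.profileStats. It reads the
collection count in code; the same program can also be run with
"--DRT-gcopt=profile:1" to get the summary shown above.

```d
import core.memory : GC;
import std.regex : matchFirst, regex;
import std.stdio : writefln;

// Hypothetical helper: counts the lines matching the pattern from the
// report, with one regex invocation per line.
size_t countMatches(string[] lines)
{
    auto re = regex("[RD].*ION[0-2]");
    size_t matched;
    foreach (line; lines)
    {
        if (!matchFirst(line, re).empty)
            ++matched;
    }
    return matched;
}

void main()
{
    // Made-up sample data standing in for the real input file.
    string[] lines = ["REGION1", "DIVISION2", "no match here"];

    // Collections performed so far, as counted by the GC itself.
    immutable before = GC.profileStats().numCollections;

    size_t total;
    foreach (i; 0 .. 100_000)
        total += countMatches(lines);

    immutable after = GC.profileStats().numCollections;
    writefln("matches: %s, GC collections during loop: %s",
             total, after - before);
}
```

Comparing the reported collection count across compiler versions is one way
to check whether the per-match allocation is present.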

I am investigating further.
