[Issue 17161] [REG 2.072.2] Massive Regex Slowdown
via Digitalmars-d-bugs
digitalmars-d-bugs at puremagic.com
Thu Feb 9 12:32:26 PST 2017
https://issues.dlang.org/show_bug.cgi?id=17161
--- Comment #2 from Jack Stouffer <jack at jackstouffer.com> ---
Bad news: I see a similar performance decrease for run-time regex as well.
# 2.073.0
$ dmd -O -inline -release test2.d && cat input5000000.txt | time ./test2
./test2 4.44s user 0.09s system 98% cpu 4.591 total
# 2.072.2
$ ~/digger/result/bin/dmd -O -inline -release test2.d && cat input5000000.txt | time ./test2
./test2 3.20s user 0.09s system 98% cpu 3.344 total
I consistently get around a second and a half longer run time with 2.073.
Code
import std.algorithm;
import std.array;
import std.range;
import std.regex;
import std.stdio;
import std.typecons;
import std.utf;

static variants = [
    "agggtaaa|tttaccct",
    "[cgt]gggtaaa|tttaccc[acg]",
    "a[act]ggtaaa|tttacc[agt]t",
    "ag[act]gtaaa|tttac[agt]ct",
    "agg[act]taaa|ttta[agt]cct",
    "aggg[acg]aaa|ttt[cgt]ccct",
    "agggt[cgt]aa|tt[acg]accct",
    "agggta[cgt]a|t[acg]taccct",
    "agggtaa[cgt]|[acg]ttaccct",
];

void main()
{
    auto app = appender!string;
    app.reserve(5_000_000);
    app.put(stdin
        .byLineCopy(KeepTerminator.yes)
        .joiner
        .byChar);
    auto seq = app.data;

    auto regexLineFeeds = regex(">.*\n|\n");
    seq = seq.replaceAll(regexLineFeeds, "");

    foreach (pattern; variants)
    {
        writeln(pattern, " ", seq.matchAll(pattern).walkLength);
    }
}