Replacing tango.text.Ascii.isearch
rassoc
rassoc at posteo.de
Thu Oct 6 21:36:48 UTC 2022
On 10/5/22 23:50, torhu via Digitalmars-d-learn wrote:
> I did some basic testing, and regex was two orders of magnitude faster. So now I know, I guess.
And what kind of testing was that? Care to share? I ask because I did the following real quick and wasn't able to measure a "two orders of magnitude" difference. Sure, the regex version came out on top, but both were faster than the Ruby baseline I cooked up.
First, generate a word file with 100k entries of various lengths:
$> dmd -run words.d foobaz 100000
---
import std;

string randomWord(ulong n) {
    static chars = letters.array;
    return generate!(() => chars.choice).take(n).text;
}

void main(string[] args) {
    enforce(args.length == 3, "Usage: dmd -run words.d needle num");
    auto f = File("words.txt", "w");
    foreach (i; 0..args[2].to!ulong) {
        ulong n = uniform(0, 50), m = uniform(0, 50);
        if (i % 2 == 0)
            f.writeln(randomWord(n), args[1], randomWord(m));
        else
            f.writeln(randomWord(n + m));
    }
}
---
And then for the actual measuring:
$> for v in range regex; do dmd -O -version=$v -of=search-$v search.d; done
$> for v in range regex; do ldc -O -d-version=$v -of=search-$v-ldc search.d; done
$> for b in search-{range,regex}{,-ldc}; do time ./$b foobaz; done
---
import std;

void main(string[] args) {
    enforce(args.length == 2, "Usage: search 'needle'");

    version (regex)
        auto rx = regex(args[1], "i");
    else version (range)
        auto needle = args[1].asLowerCase.text;
    else
        static assert(0, "use -version={regex,range}");

    ulong matches;
    foreach (line; File("words.txt").byLine) {
        version (regex)
            if (line.matchFirst(rx))
                matches++;
        version (range)
            if (line.asLowerCase.canFind(needle))
                matches++;
    }
    writeln(matches);
}
---
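If process startup and file reading are suspected of skewing the `time` numbers, something like the following could time only the two search loops within a single process. This is just a sketch and was not part of the measurements above: `countRegex`, `countRange`, and the `bench` program name are made up for illustration, and it assumes the same `words.txt` as before.

```d
import std.algorithm.searching : canFind;
import std.array : array;
import std.conv : text;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.exception : enforce;
import std.regex : matchFirst, regex;
import std.stdio : File, writefln;
import std.uni : asLowerCase;

// Hypothetical helper: count lines that match the needle as a
// case-insensitive regex (regex compilation is included in the timing).
ulong countRegex(string[] lines, string needle) {
    auto rx = regex(needle, "i");
    ulong matches;
    foreach (line; lines)
        if (!line.matchFirst(rx).empty)
            matches++;
    return matches;
}

// Hypothetical helper: count lines whose lowercased form contains
// the lowercased needle, using lazy ranges.
ulong countRange(string[] lines, string needle) {
    auto lowered = needle.asLowerCase.text;
    ulong matches;
    foreach (line; lines)
        if (line.asLowerCase.canFind(lowered))
            matches++;
    return matches;
}

version (unittest) {} else
void main(string[] args) {
    enforce(args.length == 2, "Usage: bench 'needle'");

    // Read everything up front so file I/O is excluded from both timings.
    auto lines = File("words.txt").byLineCopy.array;

    auto sw = StopWatch(AutoStart.yes);
    auto rxMatches = countRegex(lines, args[1]);
    auto rxTime = sw.peek;

    sw.reset(); // keeps running, but restarts from zero
    auto rangeMatches = countRange(lines, args[1]);
    auto rangeTime = sw.peek;

    writefln("regex: %s matches in %s", rxMatches, rxTime);
    writefln("range: %s matches in %s", rangeMatches, rangeTime);
}
```

Both helpers should report the same count for a plain alphabetic needle such as foobaz; a needle containing regex metacharacters would be interpreted differently by the two versions.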