Replacing tango.text.Ascii.isearch

rassoc rassoc at posteo.de
Thu Oct 6 21:36:48 UTC 2022


On 10/5/22 23:50, torhu via Digitalmars-d-learn wrote:
> I did some basic testing, and regex was two orders of magnitude faster. So now I know, I guess.

And what kind of testing was that? Care to share? I did the following real quick and wasn't able to measure a "two orders of magnitude" difference. Sure, the regex version came out on top, but both were faster than the Ruby baseline I cooked up.

First, generate a word file with 100k entries of various lengths:

$> dmd -run words.d foobaz 100000
---
import std;

string randomWord(ulong n) {
     static chars = letters.array;
     return generate!(() => chars.choice).take(n).text;
}

void main(string[] args) {
     enforce(args.length == 3, "Usage: dmd -run words.d needle num");

     auto f = File("words.txt", "w");
     foreach (i; 0..args[2].to!ulong) {
         ulong n = uniform(0, 50), m = uniform(0, 50);
         // every other line embeds the needle between two random words
         if (i % 2 == 0)
             f.writeln(randomWord(n), args[1], randomWord(m));
         else
             f.writeln(randomWord(n + m));
     }
}
---

And then for the actual measuring (the braces are shorthand: build and run once per variant):

$> dmd -O -version={range,regex} -of=search-{range,regex} search.d
$> ldc -O -d-version={range,regex} -of=search-{range,regex}-ldc search.d
$> time ./search-{range,regex}{,-ldc} foobaz
---
import std;

void main(string[] args) {
     enforce(args.length == 2, "Usage: search 'needle'");

     version (regex)
         auto rx = regex(args[1], "i");
     else version (range)
         auto needle = args[1].asLowerCase.text;
     else
         static assert(0, "use -version={regex,range}");

     ulong matches;
     foreach (line; File("words.txt").byLine) {
         version (regex)
             if (line.matchFirst(rx))
                 matches++;
         version (range)
             if (line.asLowerCase.canFind(needle))
                 matches++;
     }
     writeln(matches);
}
---
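
To take process startup and file I/O out of the picture, both variants could also be timed in-process instead of with `time`. What follows is only a rough sketch under a few assumptions, not something that was measured for this post: the helper names `hasNeedleRange` and `hasNeedleRegex` are made up for illustration, and it expects the same words.txt as above in the current directory.

```d
import std;
import std.datetime.stopwatch : AutoStart, StopWatch;

// Hypothetical helpers, not part of search.d above.
bool hasNeedleRange(const(char)[] line, string lowerNeedle) {
    // lowerNeedle must already be lowercase
    return line.asLowerCase.canFind(lowerNeedle);
}

bool hasNeedleRegex(const(char)[] line, Regex!char rx) {
    return !line.matchFirst(rx).empty;
}

void main(string[] args) {
    enforce(args.length == 2, "Usage: bench 'needle'");

    // Read everything up front so file I/O stays out of the timings.
    auto haystack = File("words.txt").byLineCopy.array;
    auto needle = args[1].asLowerCase.text;
    auto rx = regex(args[1], "i");

    ulong viaRange, viaRegex;

    auto sw = StopWatch(AutoStart.yes);
    foreach (line; haystack)
        if (hasNeedleRange(line, needle))
            viaRange++;
    auto tRange = sw.peek;

    sw.reset(); // keeps running, but counts from zero again
    foreach (line; haystack)
        if (hasNeedleRegex(line, rx))
            viaRegex++;
    auto tRegex = sw.peek;

    writefln("range: %s matches in %s", viaRange, tRange);
    writefln("regex: %s matches in %s", viaRegex, tRegex);
}
```

Both counts should agree for a plain-ASCII needle. A single pass like this is still noisy, so repeating the loops a few times and comparing the best runs would be more telling.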

