String compare performance
bearophile
bearophileHUGS at lycos.com
Sat Nov 27 19:08:29 PST 2010
I have done another test:
Timings, dmd compiler, best of 4, seconds:
D #1: 5.72
D #4: 1.84
D #5: 1.73
Psy: 1.59
D #2: 0.55
D #6: 0.47
D #3: 0.34
import std.file: read;
import std.c.stdio: printf;

// Counts the stop codons TAG, TGA and TAA using inlined single-char comparisons.
int test(char[] data) {
    int count;
    // Note: this bound stops one window short of the last possible codon.
    foreach (i; 0 .. data.length - 3) {
        char[] codon = data[i .. i + 3];
        if ((codon.length == 3 && codon[0] == 'T' && codon[1] == 'A' && codon[2] == 'G') ||
            (codon.length == 3 && codon[0] == 'T' && codon[1] == 'G' && codon[2] == 'A') ||
            (codon.length == 3 && codon[0] == 'T' && codon[1] == 'A' && codon[2] == 'A'))
            count++;
    }
    return count;
}

void main() {
    char[] data0 = cast(char[])read("data.txt");
    // Repeat the input n times to get a longer run.
    int n = 300;
    char[] data = new char[data0.length * n];
    for (size_t pos; pos < data.length; pos += data0.length)
        data[pos .. pos+data0.length] = data0;
    printf("%d\n", test(data));
}
So when comparing strings whose length is known at compile time to be small (say, fewer than 6 chars), the comparison should be replaced with inlined single-char comparisons. This makes the code longer, so it increases code cache pressure, but seeing how much slower the alternative is, I think it's an improvement.
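As a rough sketch of what such a replacement could look like when written by hand, the template below unrolls the comparison against a literal into one test per char. The names equalsLiteral and isStopCodon are hypothetical, and static foreach is an assumption: it needs a compiler much newer than the dmd used for the timings above. This is not what dmd does internally, and it has not been timed.

```d
import std.stdio : writeln;

// Hypothetical helper: compares s against a literal known at compile time.
// The static foreach is unrolled, so the body becomes one test per char.
bool equalsLiteral(string lit)(const(char)[] s) {
    if (s.length != lit.length)
        return false;
    static foreach (i; 0 .. lit.length) {
        if (s[i] != lit[i])
            return false;
    }
    return true;
}

// Hypothetical helper: true for the three stop codons used in the test.
bool isStopCodon(const(char)[] codon) {
    return equalsLiteral!"TAG"(codon) ||
           equalsLiteral!"TGA"(codon) ||
           equalsLiteral!"TAA"(codon);
}

void main() {
    assert(isStopCodon("TAG"));
    assert(isStopCodon("TGA"));
    assert(isStopCodon("TAA"));
    assert(!isStopCodon("TAC"));
    assert(!isStopCodon("TA")); // wrong length is rejected up front
    writeln("ok");
}
```

Whether this beats a plain codon == "TAG" depends on the compiler and its inliner, so it would need to be measured the same way as the versions above.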
(A smart compiler could even remove the codon.length == 3 tests, because the slice data[i .. i+3] always has length 3. Mysteriously, if you remove those three length tests by hand, the program compiled with dmd gets slower.)
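For reference, a minimal sketch of the variant with the length tests removed, indexing the buffer directly instead of slicing. The function names are hypothetical, and unlike the test above this sketch scans every window including the last one. It is checked against a plain slice comparison for correctness only; it makes no claim about speed.

```d
import std.stdio : writeln;

// Hypothetical reference version: ordinary slice comparisons.
size_t countWithSliceCompare(const(char)[] data) {
    if (data.length < 3)
        return 0;
    size_t count;
    foreach (i; 0 .. data.length - 2) {
        const codon = data[i .. i + 3];
        if (codon == "TAG" || codon == "TGA" || codon == "TAA")
            count++;
    }
    return count;
}

// Hypothetical variant: single-char comparisons, no slice and no length tests.
size_t countWithCharCompare(const(char)[] data) {
    if (data.length < 3)
        return 0;
    size_t count;
    foreach (i; 0 .. data.length - 2) {
        if (data[i] == 'T' &&
            ((data[i + 1] == 'A' && (data[i + 2] == 'G' || data[i + 2] == 'A')) ||
             (data[i + 1] == 'G' && data[i + 2] == 'A')))
            count++;
    }
    return count;
}

void main() {
    auto sample = "ATGTAGCCTGATAA"; // contains TAG, TGA and TAA once each
    assert(countWithSliceCompare(sample) == countWithCharCompare(sample));
    writeln(countWithCharCompare(sample)); // prints 3
}
```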
Bye,
bearophile