Code improvement for DNA reverse complement?
ag0aep6g via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Mon May 22 03:35:36 PDT 2017
On 05/22/2017 10:58 AM, biocyberman wrote:
> @ag0aep6g
>> You fell into a trap there. The value is calculated at compile time,
>> but it has copy/paste-like behavior. That is, whenever you use
>> `chars`, the code behaves as if you typed out the array literal. That
>> means, the whole array is re-created on every iteration.
>
>> Use `static immutable` instead. It still forces compile-time
>> calculation, but it doesn't have copy/paste behavior. Speeds up
>> revComp3 a lot.
>
> With 'iteration' here you mean running lifetime of the function, or in
> other words, each one of the 10_000 cycles in the benchmark?
For reference, here is the version of revComp3 I commented on:
----
string revComp3(string bps) {
    const N = bps.length;
    enum chars = [Repeat!('A'-'\0', '\0'), 'T',
                  Repeat!('C'-'A'-1, '\0'), 'G',
                  Repeat!('G'-'C'-1, '\0'), 'C',
                  Repeat!('T'-'G'-1, '\0'), 'A'];
    char[] result = new char[N];
    for (int i = 0; i < N; ++i) {
        result[i] = chars[bps[N-i-1]];
    }
    return result.assumeUnique;
}
----
By "iteration" I mean every execution of the body of the `for` loop. For
every new `i`, a new array is created.
The loop above is equivalent to this:
----
for (int i = 0; i < N; ++i) {
    result[i] = [Repeat!('A'-'\0', '\0'), 'T',
                 Repeat!('C'-'A'-1, '\0'), 'G',
                 Repeat!('G'-'C'-1, '\0'), 'C',
                 Repeat!('T'-'G'-1, '\0'), 'A'][bps[N-i-1]];
}
----
Used like that, the array literal
    [Repeat!('A'-'\0', '\0'), 'T',
     Repeat!('C'-'A'-1, '\0'), 'G',
     Repeat!('G'-'C'-1, '\0'), 'C',
     Repeat!('T'-'G'-1, '\0'), 'A']
allocates a new array on every execution of `result[i] = ...;`.
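For comparison, here is a sketch of the same function with `static immutable` in place of `enum`, as suggested in the quoted advice. The imports and the small `main` are my additions for illustration; they are not from the original post. The table is still computed at compile time, but `chars` now names one array that exists once, so the loop only indexes into it.

```d
import std.exception : assumeUnique;
import std.meta : Repeat;

string revComp3(string bps) {
    const N = bps.length;
    /* Still evaluated at compile time, but stored once. Using `chars`
    refers to that one array instead of pasting in an array literal. */
    static immutable chars = [Repeat!('A'-'\0', '\0'), 'T',
                              Repeat!('C'-'A'-1, '\0'), 'G',
                              Repeat!('G'-'C'-1, '\0'), 'C',
                              Repeat!('T'-'G'-1, '\0'), 'A'];
    char[] result = new char[N];
    for (int i = 0; i < N; ++i) {
        result[i] = chars[bps[N-i-1]];
    }
    return result.assumeUnique;
}

void main()
{
    import std.stdio : writeln;
    writeln(revComp3("AACG")); /* CGTT */
}
```

The only change to the function body is the storage class of `chars`; everything else is as in the version above.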
> Could you provide some more reading for what you are telling here? I can
> only guess it is intrinsic behavior of an 'enum'.
Unfortunately, the spec page (<https://dlang.org/spec/enum.html>)
doesn't seem to mention this.
But Ali Çehreli covers it in his book on the "immutability" page (I
would have expected to find it on the "enum" page):
http://ddili.org/ders/d.en/const_and_immutable.html#ix_const_and_immutable.enum
The details can be confusing here. There is an element of
copy/paste-like behavior, but it's not as simple as taking the
right-hand side of the enum declaration and substituting it for the
left-hand name.
The right-hand side is evaluated at compile time. The result of that can
be thought of as an array literal. It's that array literal that gets
substituted for the name.
An example with comments:
----
import std.stdio: writeln;

/* f prints a message when called at run time. Then it returns its
argument times ten. */
int f(int x)
{
    if (!__ctfe) writeln("f(", x, ")");
    return x * 10;
}

void main()
{
    /* The following line prints f's messages. The f calls are normal
    run-time calls. Then the writeln prints "false" because each array
    literal creates a new, distinct array.
    */
    writeln([f(1), f(2)] is [f(1), f(2)]); /* false */

    /* The next `enum` line does not print f's messages. The calls go
    through CTFE.
    The `writeln` line afterwards prints "false". ea gets pre-computed
    via CTFE, but the result acts like an array literal. So it's the
    same as writing `writeln([10, 20] is [10, 20]);`.
    */
    enum int[] ea = [f(1), f(2)];
    writeln(ea is ea); /* false */

    /* No messages either with `static immutable`. Like ea, the
    right-hand side goes through CTFE.
    But unlike ea, ia does not act like an array literal. `writeln`
    prints "true".
    */
    static immutable int[] ia = [f(1), f(2)];
    writeln(ia is ia); /* true */
}
----