Code improvement for DNA reverse complement?

biocyberman via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Fri May 19 05:21:10 PDT 2017


On Friday, 19 May 2017 at 09:17:04 UTC, Biotronic wrote:
> On Friday, 19 May 2017 at 07:29:44 UTC, biocyberman wrote:
>> [...]
>
> Question about your implementation: you assume the input may 
> contain newlines, but don't handle any other non-ACGT 
> characters. The problem definition states 'DNA string' and the 
> sample dataset contains no non-ACGT chars. Is this an oversight 
> on my part or yours, or did you just decide to support more than 
> the problem requires?
>
> [...]

Firstly, thank you for showing me various solutions, and even 
cool benchmark code. To answer your questions: yes, I assume the 
input file would realistically contain newlines, even though the 
problem does not care about them. I also thought about non-ACGT 
bases, but haven't taken care of those cases. In reality we 
should deal with at least ambiguous bases (N).
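
To make the newline and N handling concrete, here is a minimal 
sketch of what I have in mind. It is not one of the revComp 
versions from this thread; the names complement and revCompN are 
made up for illustration, and passing unknown characters through 
unchanged is only an assumption about what would be sensible.

```d
import std.stdio : writeln;

/// Complement of a single base. The ambiguous base 'N' maps to itself.
/// Hypothetical helper, not taken from the thread.
char complement(char c) pure nothrow @nogc @safe
{
    switch (c)
    {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'N': return 'N';
        default:  return c; // assumption: leave anything else unchanged
    }
}

/// Reverse complement that skips '\n' and '\r' in the input.
string revCompN(const(char)[] seq) pure
{
    auto result = new char[](seq.length);
    size_t n = 0;
    foreach_reverse (c; seq)
    {
        if (c == '\n' || c == '\r')
            continue; // line breaks are not part of the sequence
        result[n++] = complement(c);
    }
    return result[0 .. n].idup; // copy only the filled part
}

void main()
{
    writeln(revCompN("AAAACCCGGT\nNNAC\n")); // prints GTNNACCGGGTTTT
}
```

A real tool would probably also want the other IUPAC ambiguity 
codes and lowercase input, which this sketch ignores.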

I ran your code and also see that the switch version is faster 
than the associative array (AA) version (i.e. revComp0 is the 
fastest). So Stefan is right about this.
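
For anyone reading along, the two lookup styles being compared 
look roughly like the sketch below. This is not the benchmark 
code from the thread; complementAA and complementSwitch are 
hypothetical names, and it only shows the difference in how a 
single base is looked up, without measuring anything.

```d
import std.stdio : writeln;

// Lookup through a built-in associative array: every base costs a
// hash lookup, and a missing key raises a RangeError.
char complementAA(const char[char] table, char c)
{
    return table[c];
}

// Lookup through a switch: the compiler can turn this into a jump
// table or a few compares, with no hashing involved.
char complementSwitch(char c) pure nothrow @nogc @safe
{
    switch (c)
    {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'C': return 'G';
        case 'G': return 'C';
        default:  return c; // assumption: leave anything else unchanged
    }
}

void main()
{
    char[char] table = ['A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'];

    // Both styles should give the same answer for the four plain bases.
    foreach (char c; "ACGT")
        assert(complementAA(table, c) == complementSwitch(c));

    writeln("both lookups agree on A, C, G and T");
}
```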

Some follow-up questions:

1. Why do we need to use assumeUnique in 'revComp0' and 
'revComp3'?

2. What is going on with the trick of making chars enum like that 
in 'revComp3'?




