Code improvement for DNA reverse complement?

Fri May 19 05:21:10 PDT 2017

On Friday, 19 May 2017 at 09:17:04 UTC, Biotronic wrote:
> On Friday, 19 May 2017 at 07:29:44 UTC, biocyberman wrote:
>> [...]
>
> Question about your implementation: you assume the input may 
> contain newlines, but don't handle any other non-ACGT 
> characters. The problem definition states 'DNA string' and the 
> sample dataset contains no non-ACGT chars. Is this an oversight
> on my part or yours, or did you just decide to support more than
> the problem requires?
>
> [...]

Firstly, thank you for showing me various solutions, and even the
cool benchmark code. To answer your questions: Yes, I assume the
input file would realistically contain newlines, even though the
problem does not care about them. I also thought about non-ACGT
bases, but haven't handled those cases. In reality we should deal
with at least the ambiguous base N.
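
For what it's worth, a minimal sketch of how newlines and N could be
handled with a switch might look like the following. The names
`complement` and `revCompN` are made up for illustration and are not
from the thread. Passing unrecognized characters through unchanged is
my assumption, not part of the problem statement (N is its own
complement, so it falls out of the same rule).

```d
import std.exception : assumeUnique;

// Hypothetical helper (not from the thread): complement a single base.
// Anything that is not A, C, G or T, including N, is returned unchanged.
char complement(char c) pure nothrow @nogc @safe
{
    switch (c)
    {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'C': return 'G';
        case 'G': return 'C';
        default:  return c; // assumption: N and other codes pass through
    }
}

// Hypothetical sketch: reverse complement that drops line breaks.
string revCompN(string input) pure
{
    auto buf = new char[](input.length);
    size_t n = 0;
    foreach_reverse (c; input)
    {
        if (c == '\n' || c == '\r')
            continue; // skip newlines instead of complementing them
        buf[n++] = complement(c);
    }
    // buf is local and never escapes as mutable, so the cast is safe here.
    return buf[0 .. n].assumeUnique;
}
```

With this, revCompN("ACGN\nTT") would give "AANCGT". Lowercase input
and the full set of IUPAC ambiguity codes are deliberately left out.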

I ran your code and can confirm that the switch is faster than the
associative array (AA), i.e. revComp0 is the fastest. So Stefan is
right about this.

Some follow-up questions:

1. Why do we need to use assumeUnique in 'revComp0' and 
'revComp3'?

2. What is going on with the trick of declaring the chars as enum
in 'revComp3'?