how to count number of letters with std.algorithm.count / std.algorithm.reduce / std.algorithm.map ?

Sean Kelly sean at invisibleduck.org
Fri Nov 23 21:40:12 PST 2012


On Nov 16, 2012, at 7:49 AM, bioinfornatics <bioinfornatics at fedoraproject.org> wrote:

> hi,
> 
> I would like to count the occurrences of one or more letters in a string or a list of strings (string[]) without using a for loop, using std.algorithm instead to compute this efficiently.
> 
> if you have:
> string   seq1 = "ACGATCGATCGATCGCGCTAGCTAGCTAG";
> string[] seq2 = ["ACGATCGATCGATCGCGCTAGCTAGCTAG", "ACGATGACGATCGATGCTAGCTAG"];
> 
> I tried:
> 
> reduce!( (seq) => seq.count("G"), seq.count("C"))(tuple(0LU,0LU),seq1)

D has map and reduce but not MapReduce, so this approach feels a bit unnatural.  Assuming ASCII characters and a reasonably sized sequence, here's the simplest approach:

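	import std.algorithm : group, sort;
	import std.stdio : writefln;

	// sort() needs a random-access range, so work on a mutable byte copy.
	auto seq1 = cast(byte[])("ACGATCGATCGATCGCGCTAGCTAGCTAG".dup);

	foreach(e; group(sort(seq1))) {
		writefln("%s occurs %s times", cast(char) e[0], e[1]);
	}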
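If you only care about a few specific letters, std.algorithm.count can be applied directly, with no sorting. This is a minimal sketch, not a tuned implementation; the helper names countLetter, countLetterAll and countGC are made up for illustration, and the map/sum version assumes a Phobos release recent enough to provide std.algorithm.sum.

```d
import std.algorithm : count, map, sum;
import std.stdio : writefln;

// Hypothetical helper: occurrences of one letter in one sequence.
size_t countLetter(string seq, char letter)
{
	return seq.count(letter);
}

// Hypothetical helper: occurrences of one letter across several sequences.
size_t countLetterAll(string[] seqs, char letter)
{
	return seqs.map!(s => s.count(letter)).sum;
}

// Hypothetical helper: G and C counted together with a predicate.
size_t countGC(string seq)
{
	return seq.count!(c => c == 'G' || c == 'C');
}

void main()
{
	string   seq1 = "ACGATCGATCGATCGCGCTAGCTAGCTAG";
	string[] seq2 = ["ACGATCGATCGATCGCGCTAGCTAGCTAG", "ACGATGACGATCGATGCTAGCTAG"];

	writefln("G in seq1: %s", countLetter(seq1, 'G'));       // 8
	writefln("G+C in seq1: %s", countGC(seq1));              // 16
	writefln("G across seq2: %s", countLetterAll(seq2, 'G')); // 15
}
```

Each call to count walks the sequence once, so counting k different letters this way costs k passes. That is fine for a handful of letters and short input, but a single-pass table is the better fit when you want every letter.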

For real code, the correct approach really depends on the number of discrete values, how dense the set of values is, and the total number of elements to evaluate.  For English letters the fastest option is likely an int[26] indexed by letter.  For a more diverse set of input, a hash table.  For a huge input size, something like MapReduce is appropriate.
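To make the first two options concrete, here is a rough sketch of both. Note that the dense version swaps the int[26] for a 256-entry table indexed by byte value, which avoids range-checking each character but still assumes ASCII input; the function names countDense and countSparse are made up for illustration.

```d
import std.stdio : writefln;
import std.string : representation;

// Hypothetical helper: dense table indexed by byte value (assumes ASCII input).
size_t[256] countDense(string seq)
{
	size_t[256] counts; // static arrays of size_t start out zeroed
	foreach (b; seq.representation) // iterate raw bytes, no UTF decoding
		++counts[b];
	return counts;
}

// Hypothetical helper: hash table for sparse or wide sets of values.
size_t[dchar] countSparse(string seq)
{
	size_t[dchar] counts;
	foreach (dchar c; seq) // decode to code points
		++counts[c];       // a missing key is created with value 0 first
	return counts;
}

void main()
{
	auto seq1 = "ACGATCGATCGATCGCGCTAGCTAGCTAG";

	auto dense = countDense(seq1);
	writefln("G: %s, C: %s", dense['G'], dense['C']);

	// Associative array iteration order is unspecified.
	foreach (letter, n; countSparse(seq1))
		writefln("%s occurs %s times", letter, n);
}
```

Both versions make a single pass over the input. The dense table costs a fixed 256 counters regardless of input, while the associative array only stores the values that actually occur.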

