Investigation: downsides of being generic and correct

Jonathan M Davis jmdavisProg at gmx.com
Thu May 16 12:15:40 PDT 2013


On Thursday, May 16, 2013 12:35:11 Dicebot wrote:
> Want to bring into discussion people that are not on Google+.
> Samuel recently has posted there some simple experiments with
> bioinformatics and bad performance of Phobos-based snippet has
> surprised me.
> 
> I did explore issue a bit and reported results in a blog post
> (snippets are really small and simple) :
> http://dicebot.blogspot.com/2013/05/short-performance-tuning-story.html
> 
> One open question remains though - can D/Phobos do better here?
> Can some changes be done to Phobos functions in question to
> improve performance or creating bioinformatics-specialized
> library is only practical solution?

1. In general, if you want to operate on ASCII, and you want your code to be 
fast, use immutable(ubyte)[], not immutable(char)[]. Obviously, that's not 
going to work in this case, because the function is in std.string, but maybe 
that's a reason for some std.string functions to have ubyte overloads which 
are ASCII-specific.
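
To illustrate the first point, here is a minimal sketch of counting ASCII characters over the raw bytes rather than over decoded code points. The helper name countGC and the sample input are hypothetical, chosen only for illustration; this is not the code from the blog post. It assumes the input really is pure ASCII, and it uses std.string.representation to view the string as immutable(ubyte)[] without copying.

```d
import std.string : representation;

// Hypothetical helper (not part of Phobos): counts 'G' and 'C' in an
// ASCII-only string by walking its bytes instead of decoding UTF-8.
size_t countGC(string dna)
{
    // representation() reinterprets the string as immutable(ubyte)[];
    // no copy is made and no UTF-8 decoding happens in the loop below.
    immutable(ubyte)[] bytes = dna.representation;

    size_t n = 0;
    foreach (b; bytes)
    {
        if (b == 'G' || b == 'C')
            ++n;
    }
    return n;
}

void main()
{
    // Assumed sample input, for illustration only.
    assert(countGC("AGCTTGCA") == 4);
    assert(countGC("") == 0);
}
```

Whether this actually beats the std.string version for a given input is something to measure rather than assume, but it does avoid the per-character decoding that generic string ranges perform.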

2. We actually discussed removing all of the pattern stuff completely and 
replacing it with regexes (which is why countchars doesn't follow Phobos' 
naming scheme correctly - I left the pattern-using functions alone). However, 
that requires that someone who is appropriately familiar with regexes go and 
implement new versions of all of these functions which use std.regex. It 
should definitely be done, but no one has taken the time to do so yet.

3. While some functions in Phobos are well-optimized, there are plenty of them 
which aren't. They do the job, but no one has taken the time to optimize their 
implementations. This should be fixed, but again, it requires that someone 
spend the time to do the optimizations, and while that has been done for some 
functions, it definitely hasn't been done for all. And if Python is faster than 
D at something, odds are that either the code in question is poorly written or 
that whatever Phobos functions it's using haven't been properly optimized yet.

- Jonathan M Davis

