[OT] Application case study comparing Java, Go, and C++

Jon Degenhardt jond at noreply.com
Fri Mar 1 18:37:33 UTC 2019


On Friday, 1 March 2019 at 12:46:12 UTC, Pjotr Prins wrote:
> On Thursday, 28 February 2019 at 23:50:44 UTC, Jon Degenhardt 
> wrote:
>> On Thursday, 28 February 2019 at 22:58:54 UTC, Seb wrote:
>>> On Thursday, 28 February 2019 at 20:48:01 UTC, Jon Degenhardt 
>>> wrote:
>>>> [...]
>>>
>>> I wouldn't give much value to this paper. It hasn't been peer 
>>> reviewed and I doubt it would pass any. A quick example:
>>>
>>> "It [their tool] can be used as a drop-in replacement for 
>>> many operations implemented by SAMtools [...]". Though no 
>>> performance comparison was done against samtools (nor any 
>>> other tools except their own implementations). I find this 
>>> pretty shocking, because their entire paper's purpose is 
>>> about performance...
>>>
>>> For reference, samtools is the de-facto standard for a reason 
>>> (yes it's old and written in C).
>>>
>>> Though, to be fair sambamba (written in D) is faster than the 
>>> C "standard" implementation:
>>>
>>> https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765878
>>
>> They do have benchmark comparisons against GATK 4 in another 
>> paper:
>> "elPrep 4: A multithreaded framework for sequence analysis"
>> https://doi.org/10.1371/journal.pone.0209523
>>
>> I'm not so familiar with these tool sets. How does GATK 4 
>> stack up against other tools?
>>
>> From the paper it looks like many of the performance gains 
>> over GATK 4 resulted from architecture and algorithm changes, 
>> so it may not be valid from the perspective of comparing 
>> C++/Go/Java and GC vs reference counting.
>
> As the co-author of sambamba and having a pretty good 
> understanding of samtools, I call BS on the mentioned 
> comparison paper. It is all about implementation, i.e., the 
> programmer. Saying that Go is faster than C++ makes no sense to 
> me (go figure). Maybe the C++ implementation should have used a 
> ring buffer like Sambamba does in D (Artem did the smart thing).
>
> One reason I like chess is that it is an honest comparison of 
> skill. Have two people play and you can tell quickly who is 
> superior. In computing we don't have such an easy framework. 
> You can compare tools, i.e., implementations, but making it a 
> language comparison is bound to be flawed. The problem with 
> that comparison paper is the way they wrote it up.

Thanks for the feedback (both Seb and Pjotr).

It's too bad the paper doesn't provide more meaningful value, as 
application-level comparisons of alternate programming 
environments are quite rare. Application-level benchmarks are 
useful in conjunction with the micro-benchmarks that are more 
the norm; more important, in my view. But if the work isn't well 
founded, or at least can't be shown to be well founded, then 
it's not useful. If there were a number of similar results it 
might be seen as contributing evidence. As a single work it'd 
always need to be viewed skeptically, and if people who have 
expertise in the application area don't find it worthy, well...

--Jon


More information about the Digitalmars-d mailing list