[OT] Application case study comparing Java, Go, and C++
Jon Degenhardt
jond at noreply.com
Thu Feb 28 23:50:44 UTC 2019
On Thursday, 28 February 2019 at 22:58:54 UTC, Seb wrote:
> On Thursday, 28 February 2019 at 20:48:01 UTC, Jon Degenhardt
> wrote:
>> This paper may be of interest to people here:
>>
>> "A comparison of three programming languages for a
>> full-fledged next-generation sequencing tool", P.Costanza,
>> C.Herzeel, W.Verachtert
>> https://doi.org/10.1101/558056
>>
>> The paper compares implementations of a tool operating on
>> SAM/BAM files (bioinformatics) from a performance perspective.
>> Focus is on comparison of GC schemes used in Go and Java with
>> reference counting in C++. The GC schemes were materially
>> faster.
>>
>> I'm not familiar with the authors or the implementations, so
>> cannot say how well the implementations were done. However, it
>> appears to be a useful case study, and the authors do provide
>> a fair bit of analysis in the paper.
>>
>> There's a reddit thread also:
>> https://www.reddit.com/r/programming/comments/avsfc6/performance_comparison_of_go_c_and_java_for/
>
> I wouldn't give much value to this paper. It hasn't been peer
> reviewed and I doubt it would pass any. A quick example:
>
> "It [their tool] can be used as a drop-in replacement for many
> operations implemented by SAMtools [...]". Though no
> performance comparison was done against samtools (nor any other
> tools except their own implementations). I find this pretty
> shocking, because their entire paper's purpose is about
> performance...
>
> For reference, samtools is the de-facto standard for a reason
> (yes it's old and written in C).
>
> Though, to be fair, sambamba (written in D) is faster than the C
> "standard" implementation:
>
> https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765878

They do have benchmark comparisons against GATK 4 in another
paper:

"elPrep 4: A multithreaded framework for sequence analysis"
https://doi.org/10.1371/journal.pone.0209523

I'm not very familiar with these tool sets. How does GATK 4 stack
up against other tools?

From the paper it looks like many of the performance gains over
GATK 4 resulted from architecture and algorithm changes, so those
results may not be a valid basis for comparing C++/Go/Java or GC
vs. reference counting.