Dicebot on leaving D: It is anarchy driven development in all its glory.

Jon Degenhardt jond at noreply.com
Sun Aug 26 19:04:21 UTC 2018


On Sunday, 26 August 2018 at 05:55:47 UTC, Pjotr Prins wrote:
> Artem wrote Sambamba as a student
>
>     https://github.com/biod/sambamba
>
> and it is now running around the world in sequencing centers. 
> Many many CPU hours and a resulting huge carbon footprint. The 
> large competing C++ samtools project has been trying for 8 
> years to catch up with an almost unchanged student project and 
> they are still slower in many cases.
> 
> [snip]
>
> Note that Artem used the GC and only took GC out for critical 
> sections in parallel code. I don't buy these complaints about 
> GC.
>
> The complaints about breaking code I don't see that much 
> either. Sambamba pretty much kept compiling over the years and 
> with LDC/LLVM latest we see a 20% performance increase. For free 
> (at least from our perspective). Kudos to LDC/LLVM efforts!!

This sounds very similar to my experiences with the tsv 
utilities, on most of the same points (development simplicity, 
comparative performance, GC use, LDC). Data processing apps may 
well be a sweet spot. See my DConf talk for an overview 
(https://github.com/eBay/tsv-utils/blob/master/docs/dconf2018.pdf).

Though not mentioned in the talk, I also haven't had any 
significant issues with new compiler releases. That may be 
related to the type of code being written. Regarding the GC: the 
throughput-oriented nature of data processing tools like the tsv 
utilities looks like a very good fit for the current GC. 
Applications where low GC latency is needed may have different 
results. It'd be great to hear an experience report from 
development of an application where GC was used and low GC 
latency was a priority.

--Jon

