dhtslib v0.12.0 (high-throughput sequencing library)
James Blachly
james.blachly at gmail.com
Wed Sep 1 05:27:38 UTC 2021
I'm delighted to finally post an official announcement of our package
for high-throughput sequencing (HTS), also called next-generation
sequencing (NGS): `dhtslib`. It's not a very clever name, and we are
working on a new one. ;)
https://github.com/blachlylab/dhtslib/
https://code.dlang.org/packages/dhtslib
Once upon a time, BioD[1] was fairly active, but I am afraid D is not
heavily used in bioinformatics and computational biology, especially in
high-throughput (genome) sequencing applications when compared to its
peers.[2] However, our group (cancer genomics) has found D to be an ideal
language: it is easy for Python programmers to pick up, yet retains
powerful features for C/C++ programmers.
`dhtslib` began as a thin wrapper over the ubiquitous, but very
low-level and hard-to-use, `htslib` C library
(https://github.com/samtools/htslib/). We use `dhtslib` extensively in
both public and private projects for computational biology, and over the
years it has grown from simply a (huge) set of `extern (C)` definitions
to a fully featured, RAII-enabled bioinformatics package focused on
genome sequencing. If you are working in this field, or know
someone open to D who works in this field, I strongly encourage you to
point them at `dhtslib`!
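To illustrate the step from raw `extern (C)` declarations to an
RAII wrapper, here is a minimal sketch. It is not dhtslib code: `CFile`
is a hypothetical name, and C's `fopen`/`fclose` (the `extern (C)`
declarations shipped in druntime's `core.stdc.stdio`) stand in for an
htslib file handle.

```d
import core.stdc.stdio : FILE, fopen, fclose, fputs;
import std.string : toStringz;

/// Hypothetical RAII wrapper around a C `FILE*` (illustration only,
/// not part of dhtslib). The handle is released when the struct goes
/// out of scope, so callers never call `fclose` themselves.
struct CFile
{
    private FILE* fp;

    this(string path, string mode)
    {
        // Call straight into the C library through its extern (C) binding.
        fp = fopen(path.toStringz, mode.toStringz);
        if (fp is null)
            throw new Exception("could not open " ~ path);
    }

    // Forbid copies so that two structs can never own the same handle.
    @disable this(this);

    ~this()
    {
        if (fp !is null)
            fclose(fp);
    }

    void write(string s)
    {
        assert(fp !is null, "CFile was not opened");
        fputs(s.toStringz, fp);
    }
}

void main()
{
    {
        auto f = CFile("example.txt", "w");
        f.write("hello\n");
    } // f's destructor runs here and closes the file
}
```

The same pattern, applied to a C handle that has a matching
open/close pair, lets the caller rely on scope exit rather than manual
cleanup.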
* `htslib` namespace with complete bindings to htslib
* `dhtslib` namespace with high-level object-oriented interfaces, many
  using underlying htslib calls for high performance, but via convenient
  and idiomatic D including RAII, forward ranges, etc.
* htslib-backed read/write of SAM/BAM/CRAM, VCF/BCF
* Readers for BED and GFF3/GTF (not part of htslib)
* FASTQ streamer
* CIGAR manipulations
The next version, v0.13.0, adds a novel feature "Typesafe Coordinates",
which I'll post about separately in a moment!
Kind regards
James S Blachly, MD
The Ohio State University
[0] https://github.com/blachlylab/dhtslib/
https://code.dlang.org/packages/dhtslib
[1] https://github.com/biod/BioD
[2] Here is a contemporary example of D used in high-throughput
sequencing: DENTIST by Arne Ludwig at Max Planck Institute
https://github.com/a-ludi/dentist -- if you know of more, please
let me know!