Announcement and Request: Typesafe Coordinate Systems for High-Throughput Sequencing Applications

James Blachly james.blachly at gmail.com
Wed Sep 1 05:36:53 UTC 2021


In another post, I've just announced our D-based high-throughput 
sequencing library, dhtslib.

One feature that is, AFAIK, novel in the field is its use of the 
compiler's type system to enforce correctness across the different 
genome/reference sequence coordinate systems (zero-based vs. one-based, 
half-open vs. closed). Encoding domain-specific knowledge in a 
language's type system is clearly nothing new, but it is surprising that 
this has not been done before in bioinformatics. IMO the idea is long 
overdue given the trainwreck of different coordinate systems in our field.
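
To make the general idea concrete, here is a minimal sketch in Rust. It is not dhtslib's API and not the implementation from the manuscript; every name in it (`Interval`, `ZeroBasedHalfOpen`, `OneBasedClosed`, `region_string`) is hypothetical and chosen only for illustration. The coordinate system is carried as a type parameter, so conversions are explicit and a mismatch is rejected at compile time.

```rust
use std::marker::PhantomData;

// Hypothetical marker types naming two common conventions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ZeroBasedHalfOpen; // e.g. BED-style [start, end)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct OneBasedClosed; // e.g. region strings such as chr1:100-200

/// An interval tagged at compile time with its coordinate system `S`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Interval<S> {
    pub start: u64,
    pub end: u64,
    system: PhantomData<S>, // zero-sized; exists only for the type checker
}

impl<S> Interval<S> {
    pub fn new(start: u64, end: u64) -> Self {
        Interval { start, end, system: PhantomData }
    }
}

impl Interval<ZeroBasedHalfOpen> {
    /// Number of bases covered; `end` is exclusive.
    pub fn len(&self) -> u64 {
        self.end - self.start
    }

    /// [start, end) zero-based becomes [start + 1, end] one-based.
    pub fn to_one_based_closed(self) -> Interval<OneBasedClosed> {
        Interval::new(self.start + 1, self.end)
    }
}

impl Interval<OneBasedClosed> {
    /// Number of bases covered; `end` is inclusive.
    pub fn len(&self) -> u64 {
        self.end - self.start + 1
    }

    /// [start, end] one-based becomes [start - 1, end) zero-based.
    pub fn to_zero_based_half_open(self) -> Interval<ZeroBasedHalfOpen> {
        Interval::new(self.start - 1, self.end)
    }
}

/// Accepts only one-based closed intervals; any other system is a type error.
pub fn region_string(contig: &str, iv: Interval<OneBasedClosed>) -> String {
    format!("{}:{}-{}", contig, iv.start, iv.end)
}
```

With these definitions, passing an `Interval<ZeroBasedHalfOpen>` directly to `region_string` fails to compile with a mismatched-types error; the caller has to write `region_string("chr1", iv.to_one_based_closed())`, which is exactly the kind of off-by-one mistake the type system is meant to catch.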

You can find dhtslib's develop branch, with Typesafe Coordinates merged 
and ready to use, here:

https://github.com/blachlylab/dhtslib/


**Now the request:**
We've drafted a manuscript describing Typesafe Coordinates, which also 
serves as a sort of low-key endorsement of the D language and our 
library package `dhtslib`. You can find the manuscript here:

https://github.com/blachlylab/typesafe-coordinates/

We would be very grateful to those of you who would take the time to 
read the manuscript and post comments (publicly or privately), 
_especially if we have made any incorrect statements_ or our language 
regarding type systems is awkward or nonstandard.

We praised D and gently criticized Rust and OCaml*, as it appeared to 
me that they lack the features required to implement Typesafe 
Coordinate Systems as ergonomically as we could in D. However, I am a 
true novice in both of these languages, so it is possible that I've 
missed something significant and that the Rust and OCaml 
implementations could be retooled to match the D implementation. If 
that's the case, I'd be glad to hear it.

I plan to make a few minor cleanups and submit this to a preprint server 
as well as a scientific journal in the next week or so.

Kind regards

James S Blachly, MD
The Ohio State University


* as a side note, I actually find the OCaml code quite attractive in its 
terseness: `let j = cl_interval_of_ho (ob_interval_of_zb i)`

