Announcement and Request: Typesafe Coordinate Systems for High-Throughput Sequencing Applications

James Blachly james.blachly at gmail.com
Thu Sep 2 02:22:39 UTC 2021


On 9/1/21 5:01 AM, Arne Ludwig wrote:
> I am happy to hear of your latest idea of creating type-safe coordinate 
> systems. It's a great idea!
> 
> After reading the code on GitHub, I have only one major remark: IMHO, it 
> would be great to separate the novel coordinate systems from any 
> `htslib` dependencies ([see lines 
> 47-50](https://github.com/blachlylab/dhtslib/blob/e3b5af14e9eefa54bcc27bc0fcc9066dc3a4ea54/source/dhtslib/coordinates.d#L47-L50)) 
> as there are only auxiliary functions that use both the novel 
> coordinate systems and `htslib`. The greater goal I have in mind is to 
> provide the coordinate systems in a separate DUB sub-package (e.g. 
> `dhtslib:coordinates`) that requires only a D compiler. That makes 
> integration into existing projects that do not need `htslib` much easier.

This is an absolutely **outstanding** idea. Those imports were only to 
reuse an htslib `chr:X-Y` string parsing function, but we can trivially 
rewrite this in native D to enable sub-package independence!
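
For illustration, a minimal sketch of what a native D parser for 
`chr:X-Y` strings might look like. The function name `parseRegion` and 
its tuple return type are assumptions made for this example; they are 
not dhtslib's API, and the sketch does not handle thousands separators 
or open-ended regions such as `chr1:100-`.

```d
import std.conv : to;
import std.string : indexOf, lastIndexOf;
import std.typecons : Tuple;

/// Hypothetical helper, not part of dhtslib: split "contig:start-end"
/// into its three parts using only the D standard library.
Tuple!(string, "contig", long, "start", long, "end")
parseRegion(string region)
{
    // Contig names may themselves contain ':', so split on the last one.
    auto colon = region.lastIndexOf(':');
    if (colon < 0)
        throw new Exception("missing ':' in region: " ~ region);

    auto span = region[colon + 1 .. $];
    auto dash = span.indexOf('-');
    if (dash < 0)
        throw new Exception("missing '-' in region: " ~ region);

    // `to!long` throws a ConvException on non-numeric input.
    return typeof(return)(
        region[0 .. colon],
        span[0 .. dash].to!long,
        span[dash + 1 .. $].to!long);
}
```

The sketch returns the raw numbers only; deciding which coordinate 
system they belong to (zero- or one-based, open or closed) is left to 
the caller.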

> Also, I have a short list of minor, technical remarks:
> 
> 1. The returned type in [line 
> 114](https://github.com/blachlylab/dhtslib/blob/e3b5af14e9eefa54bcc27bc0fcc9066dc3a4ea54/source/dhtslib/coordinates.d#L114) 
> has a typo, there is an additional 's'.

Ahh, the curse of templates. Without 100% test coverage, errors like this, 
which would fail to compile in non-template code, always seem to sneak 
in. Thank you so much.
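
As a hedged illustration of why this happens: D only runs semantic 
analysis on a template body once the template is instantiated, so a 
unit test that instantiates each member is enough to surface such 
typos. The `Interval` below is a simplified stand-in, not dhtslib's type.

```d
/// Simplified stand-in for illustration; not dhtslib's `Interval`.
struct Interval(T)
{
    T start, end;

    // If this return type were misspelled (say `Intervals!T`), the module
    // would still compile as long as nothing instantiated Interval.
    Interval!T shifted(T off)
    {
        return Interval!T(start + off, end + off);
    }
}

// Instantiating the template in a unittest forces the compiler to
// check the body, so a misspelled name is reported at build time.
unittest
{
    auto i = Interval!long(0, 10).shifted(5);
    assert(i.start == 5 && i.end == 15);
}
```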

> 2. The array of identifiers `CoordSystemLabels` in [line 
> 203](https://github.com/blachlylab/dhtslib/blob/e3b5af14e9eefa54bcc27bc0fcc9066dc3a4ea54/source/dhtslib/coordinates.d#L203) 
> is a bit unsafe and not strictly required for two reasons:

An excellent suggestion. I am still a metaprogramming novice.

> 3. The function `unionImpl` in [line 
> 326](https://github.com/blachlylab/dhtslib/blob/e3b5af14e9eefa54bcc27bc0fcc9066dc3a4ea54/source/dhtslib/coordinates.d#L326) 
> actually computes the convex hull of the two intervals which should be 
> noted in the doc comment for completeness' sake.

Yes, we had some internal debate about the appropriate result of both 
union and intersect operations when the intervals do not overlap and 
the return type is not an array. We will leave it as is and document it 
as the convex hull in this case.
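
To make the distinction concrete, here is a small sketch of a convex 
hull over two intervals. `Span` and `hull` are hypothetical names for 
this example and do not reflect how `unionImpl` is actually written.

```d
import std.algorithm.comparison : max, min;

/// Stand-in interval type with plain start/end fields; not dhtslib's type.
struct Span
{
    long start, end;
}

/// Smallest single interval covering both inputs (the convex hull).
/// For overlapping inputs this equals the set union; for disjoint
/// inputs it also covers the gap between them.
Span hull(Span a, Span b)
{
    return Span(min(a.start, b.start), max(a.end, b.end));
}
```

With these definitions, `hull(Span(0, 10), Span(20, 30))` yields 
`Span(0, 30)`, which includes positions 10 to 20 that belong to neither 
input. A true set union would need two intervals to represent that case.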

> 4. I have noted that you use operator overloading for union and 
> intersection of `Interval`s. You may also add overloads for the `offset` 
> function in both `Interval` and `Coordinate` with `auto opBinary(string 
> op, T)(T off) if ((op == "+" || op == "-") && isIntegral!T)` and `auto 
> opBinaryRight(string op, T)(T off) if ((op == "+" || op == "-") && 
> isIntegral!T)`.

Very nice. I do miss operator overloading in some of the other languages 
I explored recently.
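
A minimal sketch of how such overloads might look on a simplified 
coordinate type. `Coord` and its `pos` field are assumptions for this 
example, not dhtslib's `Coordinate`. Unlike the suggestion above, this 
sketch restricts `opBinaryRight` to `+`, on the assumption that 
`n - coord` is not a meaningful offset.

```d
import std.traits : isIntegral;

/// Simplified stand-in; `pos` is an assumed field name.
struct Coord
{
    long pos;

    /// Supports `coord + n` and `coord - n` for any integral n.
    Coord opBinary(string op, T)(T off) const
        if ((op == "+" || op == "-") && isIntegral!T)
    {
        // The operator string is spliced into the expression at compile time.
        return Coord(mixin("pos " ~ op ~ " off"));
    }

    /// Supports `n + coord`; subtraction is deliberately omitted here.
    Coord opBinaryRight(string op, T)(T off) const
        if (op == "+" && isIntegral!T)
    {
        return Coord(pos + off);
    }
}
```

With this in place, `Coord(10) + 5`, `Coord(10) - 3`, and `2 + Coord(10)` 
all compile and return a new `Coord`, while `Coord(1) * 2` is rejected 
by the template constraint.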

> I enjoyed reading the manuscript. It highlights the issue clearly and 
> presents the solution without getting lost in details. Ignoring typos at 
> this stage, I have no remarks on it – keep going!

Thanks again for this critical review. As you know we are really pleased 
with how D has accelerated our science and wish to share it with the world.

James


More information about the Digitalmars-d-announce mailing list