Announcement and Request: Typesafe Coordinate Systems for High-Throughput Sequencing Applications
James Blachly
james.blachly at gmail.com
Thu Sep 2 02:22:39 UTC 2021
On 9/1/21 5:01 AM, Arne Ludwig wrote:
> I am happy to hear of your latest idea of creating type-safe coordinate
> systems. It's a great idea!
>
> After reading the code on GitHub, I have only one major remark: IMHO, it
> would be great to separate the novel coordinates systems from any
> `htslib` dependencies ([see lines
> 47-50](https://github.com/blachlylab/dhtslib/blob/e3b5af14e9eefa54bcc27bc0fcc9066dc3a4ea54/source/dhtslib/coordinates.d#L47-L50))
> as there are only auxiliary functions that use both the novel
> coordinates systems and `htslib`. The greater goal I have in mind is to
> provide the coordinate systems in a separate DUB sub-package (e.g.
> `dhtslib:coordinates`) that requires only a D compiler. That makes
> integration into existing projects that do not need `htslib` much easier.
This is an absolutely **outstanding** idea. Those imports were only to
reuse an htslib `chr:X-Y` string parsing function, but we can trivially
rewrite this in native D to enable sub-package independence!
> Also, I have a short list of minor, technical remarks:
>
> 1. The returned type in [line
> 114](https://github.com/blachlylab/dhtslib/blob/e3b5af14e9eefa54bcc27bc0fcc9066dc3a4ea54/source/dhtslib/coordinates.d#L114)
> has a typo, there is an additional 's'.
Ahh, the curse of templates. Without 100% test coverage these things
which would cause failure to compile in non-template code seem to always
sneak in. Thank you so much.
> 2. The array of identifiers `CoordSystemLabels` in [line
> 203](https://github.com/blachlylab/dhtslib/blob/e3b5af14e9eefa54bcc27bc0fcc9066dc3a4ea54/source/dhtslib/coordinates.d#L203)
> is a bit unsafe and not strictly required for two reasons:
A very excellent suggestion. I am still a metaprogramming novice.
> 3. The function `unionImpl` in [line
> 326](https://github.com/blachlylab/dhtslib/blob/e3b5af14e9eefa54bcc27bc0fcc9066dc3a4ea54/source/dhtslib/coordinates.d#L326)
> actually computes the convex hull of the two intervals which should be
> noted in the doc comment for completeness' sake.
Yes, we had some internal debate about the appropriate result of both
union and intersect operations when intervals are non-overlapping and
return type is a non-array. Will leave as is and document as convex hull
in this case.
> 4. I have noted that you use operator overloading for union and
> intersection of `Interval`s. You may also add overloads for the `offset`
> function in both `Interval` and `Coordinate` with `auto opBinary(string
> op, T)(T off) if ((op == '+' || op == '-') && isIntegral!T)` and `auto
> opBinaryRight(string op, T)(T off) if ((op == '+' || op == '-') &&
> isIntegral!T)`.
Very nice. I do miss operator overloading in some of the other languages
I explored recently.
> I enjoyed reading the manuscript. It highlights the issue clearly and
> presents the solution without getting lost in details. Ignoring typos at
> this stage, I have no remarks on it – keep going!
Thanks again for this critical review. As you know we are really pleased
with how D has accelerated our science and wish to share it with the world.
James
More information about the Digitalmars-d-announce
mailing list