Looking for a Code Review of a Bioinformatics POC
duck_tape
sstadick at gmail.com
Fri Jun 12 12:02:19 UTC 2020
On Friday, 12 June 2020 at 07:25:09 UTC, Jon Degenhardt wrote:
> tsv-utils has the advantage of only needing to support utf-8
> files with Unix newlines, so the code is simpler. (Windows
> newlines are detected, this occurs separately from
> bufferedByLine.) But as you describe, support for a wider
> variety of input cases could be done without sacrificing basic
> performance. iopipe provides much more generic support, and it
> is quite fast.
I will have to look into iopipe for sure. All this info is great.
For this particular benchmark the goal is just to show off some
'high-level' languages and how close to C they can get. If I can
avoid going way into the weeds writing my own output methods,
that's more in the spirit of things.
However, I do intend to be using D for bioinformatics, which is
incredibly IO intensive, so much of this will be put to good use.
For speedups that do involve getting my hands dirty:
- Do writef and company flush on every line? I still haven't
found the source of this.
- It looks like I could use {f}printf if I really wanted to:
https://forum.dlang.org/post/hzcjbanvkxgohkbvjnkv@forum.dlang.org
It's particularly interesting what is said about short lines
doing worse, because these are pretty short, less than 20
characters usually.
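
To make the buffering question concrete, here is a minimal sketch of one
way to cut per-line overhead without writing custom output code. It
assumes that writef and friends go through the C runtime's FILE buffer
and take the stream lock on each call, rather than flushing explicitly;
I have not confirmed this against the Phobos source. The writeRecord
helper, its tab-separated layout, and the 64 KiB buffer size are
hypothetical and chosen only for illustration.

```d
import std.array : appender;
import std.format : formattedWrite;
import std.stdio : stdout;

// Hypothetical record layout (name, start, stop); not taken from the thread.
// Works with any output range, so it can target stdout or an in-memory buffer.
void writeRecord(Sink)(ref Sink sink, const(char)[] name, int start, int stop)
{
    sink.formattedWrite("%s\t%d\t%d\n", name, start, stop);
}

void main()
{
    // Request a larger, fully buffered stdout from the C runtime.
    // The 64 KiB size is an assumption, not a tuned value.
    stdout.setvbuf(1 << 16);

    // Take the stream lock once instead of once per write call.
    auto w = stdout.lockingTextWriter();
    foreach (i; 0 .. 3)
        writeRecord(w, "chr1", i * 10, i * 10 + 5);
}
```

Because writeRecord only needs an output range, the same function can
write into an appender for testing or into the locked stdout writer for
the real run. Whether this helps for lines under 20 characters would
need to be measured.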