They wrote the fastest parallelized BAM parser in D
Laeeth Isharc via Digitalmars-d
digitalmars-d at puremagic.com
Tue Mar 31 04:04:49 PDT 2015
> As Andrew Brown pointed out, visualization is not behind
> Python's success. Its success lies in the fact that it's a
> language you can hack away in easily.
Sounds right. I am not in the camp that says it is a killer for
D. It would just be nice to have both a passable solution for
visualization and some way of making it interactive (the REPL
might be one route). The problem with separating the processes
completely, and just piping the output from the D code that does
the heavy lifting to a Python or Julia front end, is that it may
make it more painful to play with and explore the data. My
interests are finance more than science, so that may lead to a
different set of needs. Finishing mathgl and writing D bindings
for Bokeh (take a look - it is pretty cool, particularly being
able to use the browser as the client, acknowledging that this is
a tradeoff) is not so much work. But some help on Bokeh in
particular would be nice, as I fear picking one way of
implementing the object structure and later finding it was a
mistake.
> the initial euphoria of being able to automatically rename
> files and extract value X from file Y soon gives way to
> frustration when it comes to performance.
Yep.
> The paper shows well that in a world where data processing is
> of utmost importance, and we're talking about huge sets of
> data, languages like Python don't cut it anymore.
I could not agree more, and I do think the intersection of two
trends creates a tremendous opportunity for D: data sets keep
getting larger (but not necessarily richer in information in
dollar terms), while Moore's Law and memory speed and latency are
not keeping pace. This is exactly the kind of change that creeps
up on you, because not much changes in a few months (which is the
kind of horizon many of us tend to think in). It is also common
sense to look at notable successes, and I hope it is not just my
biases that lead me to think many of them are in just this kind
of application.
People ask "what is D's edge?", but my personal perception is
more "where is the competition for D?" in this area. A contender
has to be native code or JIT, and I refuse to learn Java; it
should also be plastic and lend itself to rapid iteration.
> at the same time there's growing discontent among researchers,
> scientists and engineers as regards performance, simply because
> the data sets are becoming bigger and bigger every day and the
> algorithms are getting more and more refined. Sooner or later
> people will have to find new ways, out of sheer necessity.
Upvote. I would love to see any references you have on this,
not because I doubt it (it seems rather obvious to me), but
because they are helpful when talking to other people.
> Don't forget that "the state of the art" can change very
> quickly in IT and the name of the game is anticipating new
> developments rather than taking snapshots of the current state
> of the art and framing them. D really has a lot to offer for data
> processing and I wouldn't rule it out that more and more
> programmers will turn to it for this task.
I fully agree. If we started a section on use cases, would you
be able to write a page or two on D's advantages in data
processing?