They wrote the fastest parallelized BAM parser in D

Paulo Pinto via Digitalmars-d digitalmars-d at puremagic.com
Tue Mar 31 07:17:12 PDT 2015


On Tuesday, 31 March 2015 at 11:04:50 UTC, Laeeth Isharc wrote:
>
>> As Andrew Brown pointed out, visualization is not behind 
>> Python's success. Its success lies in the fact that it's a 
>> language you can hack away in easily.
>
> Sounds right.  I am not in the camp that says it is a killer 
> for D.  It would just be nice to have both at least a passable 
> solution for visualization, and some way of making it 
> interactive.  (The REPL might be one route).  The problem with 
> separating the processes completely and just piping the output 
> from D code that does the heavy lifting to a python or julia 
> front end is it may make it more painful to play with and 
> explore the data.  My interests are finance more than science, 
> so that may lead to a different set of needs.  Finishing mathgl 
> and writing D bindings for bokeh (take a look - it is pretty 
> cool, particularly to be able to use the browser as client, 
> acknowledging that it is a tradeoff) is not so much work.  But 
> some help on bokeh particularly would be nice, as I fear 
> picking one way of implementing the object structure and later 
> finding it is a mistake.
>
>> the initial euphoria of being able to automatically rename 
>> files and extract value X from file Y soon gives way to 
>> frustration when it comes to performance.
>
> Yep.
>
>> The paper shows well that in a world where data processing is 
>> of utmost importance, and we're talking about huge sets of 
>> data, languages like Python don't cut it anymore.
>
> I could not agree more, and I do think the intersection of two 
> trends creates tremendous opportunity for D.  It's also 
> commonsensical to look at notable successes - and I hope it is 
> not just my biases that lead me to think many of these are in 
> just this kind of application.  Data sets keep getting larger 
> (but not necessarily more information rich in dollar terms), 
> and Moore's Law/memory speed+latency is not keeping pace.  This 
> is exactly the kind of change that creeps up on you because not 
> much changes in a few months (which is the kind of horizon many 
> of us tend to think in).
>
> People say "what is D's edge", but my personal perception is 
> "where is the competition for D" in this area.  It has to be 
> native code/JIT, and I refuse to learn Java; it also should be 
> plastic and lend itself to rapid iteration.
>

It is in the JVM and .NET eco-systems. Both have AOT compilers 
available, are able to chew data on GPGPUs and offer SIMD 
libraries.

This is why there is such a strong focus on value types and 
better C interop planned for Java 10, as its use for data 
analysis has been growing.

In HPF, companies prefer to live with JVM workarounds for the 
current limitations rather than go out and hire a few C++ 
developers, given the amount of money saved in salaries.


--
Paulo

