Data frames in D?

Russel Winder via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Sat Dec 27 02:53:41 PST 2014


On Sat, 2014-12-27 at 01:33 +0000, Laeeth Isharc via Digitalmars-d-learn
wrote:
[…]
> Fair argument against an earlier poster but from my perspective, 
> all I meant is that the absence of a shell is not a good reason 
> to write off D for exploring data.  Because there is a shell 
> already that could be developed, and because one can call D from 
> python / Julia in a notebook.

I think we are agreeing. A very lightweight editor and executor of code
fragments is as good as, if not better than, the one-line REPL.

[…]
> About the future you may or may not be right.  (Whether it is 
> commercially interesting to run workshops in D for stats people 
> is certainly an interesting question.  However, given the ways that 
> technology unfolds, it may be that it is less relevant to the 
> question I am most interested in answering today).

Part of the problem here is tribalism. Most data science people want to
use the same tools that other data science people use, even though the
point is to differentiate themselves. Currently R and Python are the
tools of the moment. Julia hasn't yet made deep inroads, but is totally
focused on trying to replace R and Python for data analysis.

> I want to do things in D myself, and I would find a data frame 
> helpful.  I understand you don't program much in D these days, 
> and that's a reasonable decision, but for those who want to use 
> it to do quantish things with dataframes, perhaps we could think 
> about how to approach the problem.  And having weighed your 
> warnings, if you have any suggestions on how best to implement 
> this, I would be open to these also.

A BLAS library is certainly a precursor, as are very good data
visualization tools: graphs, diagrams, etc. It isn't the language per se
that makes R, Python and increasingly Julia attractive, but the fact
that the results of the analysis can be rendered graphically.

I know much less about R, but the whole Python/NumPy thing works only
because it is faster and easier than Python alone. NumPy performance is
actually quite poor. I am finding I can write Python + Numba code that
hugely outperforms the same algorithm using NumPy.
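
As a rough illustration of the kind of comparison meant here (a sketch
only, not a benchmark): the function names and the RMS-difference
computation below are made up for the example, and Numba is assumed to
be installed for any speed-up to appear. The vectorised NumPy version
allocates temporary arrays, while the explicit loop, once compiled by
Numba, does not.

```python
import numpy as np

try:
    from numba import njit  # optional; assumed installed for the speed-up
except ImportError:
    def njit(func):
        # Fallback so the sketch still runs (slowly) without Numba.
        return func


def rms_diff_numpy(a, b):
    # Vectorised NumPy: builds temporaries for (a - b) and its square.
    return np.sqrt(np.mean((a - b) ** 2))


@njit
def rms_diff_loop(a, b):
    # Explicit loop over the elements: no temporary arrays are created.
    # With Numba present this is compiled to machine code on first call.
    total = 0.0
    for i in range(a.shape[0]):
        d = a[i] - b[i]
        total += d * d
    return (total / a.shape[0]) ** 0.5


if __name__ == "__main__":
    x = np.random.rand(1_000_000)
    y = np.random.rand(1_000_000)
    # Both versions should agree to within floating-point rounding.
    print(rms_diff_numpy(x, y), rms_diff_loop(x, y))
```

Whether the loop version actually wins depends on array size, the
operation, and the machine, so it is worth timing both (for example
with timeit) on the real workload, excluding the first compiled call.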

Go is making great play of the fact that it can attract people who use
Python for systems-style programming. Go has Gtk and Qt for graphics.
D has Gtk, but no real Qt. But in the end D isn't getting the traction
as a C/Python replacement that Go has. Go has masses of people putting
a lot of effort into Web. It's not the ideas, it's the number of people
getting on board and doing things.

To get some traction in any of these areas, finance data analysis and
model building, or systems activity, it is all about people doing it,
publicizing it and making things available for others to use. 

Taking the R array types and Pandas' DataFrames and TimeSeries and
building and using D versions is going to be needed for D to get
traction. But it needs to be better than Julia in some way that makes
others sit up and take notice. There has to be the ability to create
some hype. 
-- 
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder at ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel at winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder