[GSoC] Dataframes for D

Prateek Nayak lelouch.cpp at gmail.com
Thu May 30 04:38:09 UTC 2019


On Wednesday, 29 May 2019 at 18:41:28 UTC, jmh530 wrote:
> On Wednesday, 29 May 2019 at 18:00:02 UTC, Prateek Nayak wrote:
>> [snip]
>
> Glad to see progress being made on this!
>
> Somewhat tangentially on the interoperability, I have also made 
> quite a bit of use of R's data.frames. One difference between 
> that and what I have seen of the implementation is that R's 
> data.frames allow for different columns to be different types. 
> This makes certain kinds of analysis of groups very easy. For 
> instance, right now I'm working with a dataset whose columns 
> are doubles, dates, integers, bools, and strings. I can do the 
> equivalent of groupby on the strings as "factors" in R and it's 
> pretty straightforward to get everything working nicely.

On second thought, my mentor Nicholas Wilson pointed me to an 
interesting GitHub Gist:
https://gist.github.com/aG0aep6G/a1b87df1ac5930870ffe

A similar structure can be used to represent non-homogeneous 
data, and the DataFrame structure can be overloaded to integrate 
it. However, a homogeneous DataFrame remains the main objective 
for now. This integration will definitely happen once the 
homogeneous DataFrame comes close to looking and working like an 
actual DataFrame.
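
To give a rough idea of what per-column types could look like in 
D, here is a minimal sketch. It is not the code from the Gist and 
not the project's actual design: the names HeterogeneousFrame, 
ArrayOf and addRow are hypothetical, and the sample values are 
made up. It assumes one array per column, with the column types 
fixed at compile time.

```d
import std.meta : staticMap;
import std.stdio : writeln;

// Maps an element type T to the column type T[].
alias ArrayOf(T) = T[];

// Hypothetical name; not part of any existing library.
struct HeterogeneousFrame(ColumnTypes...)
{
    // One array per column, e.g. (double[], string[], bool[]).
    staticMap!(ArrayOf, ColumnTypes) columns;

    // Column labels, one per column.
    string[ColumnTypes.length] names;

    // Appends one row; each value must match its column's type.
    void addRow(ColumnTypes values)
    {
        static foreach (i; 0 .. ColumnTypes.length)
            columns[i] ~= values[i];
    }

    // Number of rows, read off the first column.
    size_t rows() const
    {
        return columns[0].length;
    }
}

void main()
{
    // Made-up sample data: a double, a string and a bool column.
    HeterogeneousFrame!(double, string, bool) df;
    df.names = ["price", "ticker", "active"];
    df.addRow(1.5, "ABC", true);
    df.addRow(2.5, "XYZ", false);

    writeln(df.names);      // column labels
    writeln(df.columns[1]); // the string column
    writeln(df.rows);       // row count
}
```

Because the column types are template arguments, a mismatched 
row (say, a string where a double is expected) is rejected at 
compile time. The trade-off is that columns cannot be added or 
retyped at run time, which a Variant-based design would allow.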

I'll keep you updated here in case I find anything better for 
non-homogeneous data, and when the whole thing starts to take 
shape.


More information about the Digitalmars-d mailing list