[GSoC] Dataframes for D

Prateek Nayak lelouch.cpp at gmail.com
Tue Jun 25 20:44:43 UTC 2019


On Tuesday, 25 June 2019 at 17:54:36 UTC, jmh530 wrote:
>
> Glad to see you're still making great progress.
>
> I had worked on the byDim function in mir.ndslice.topology 
> because I had wanted the same sort of functionality as 
> R's apply. It works a little differently than R's, but I find 
> it very good for a lot of things. Your version of apply (I'm 
> looking at the apply branch of magpie) looks like it operates a 
> bit like a byDim chained with an each, so byDim!N.each!f. 
> However, it also has this index variable allowing it to skip 
> rows or something (I'm not really sure if this feature pulls 
> its weight...).
>
> So I have two questions: 1) does byDim also work with 
> dataframes?, 2) can you add an overload that is apply(f, axis) 
> without the index parameter?
>
> One of my takeaways from looking at the apply function (again 
> just looking at that apply branch) is that you might benefit 
> from using more of what Ilya has already put in mir.ndslice 
> where available. For instance, the overload of apply that is 
> just apply!f is basically the same as mir's each, but each has 
> more features.

1) Currently, byDim doesn't work on DataFrame.
2) Sure, the overload can be made but what are you specifically 
looking for?
apply(f, axis)(indexes) ?

You are right, apply works like byDim!axis.each on particular 
columns/rows.
I'll look into Mir's implementation. Thanks for that advice. I do 
believe apply can be strengthened to account for different use 
cases.
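
For reference, a minimal sketch of the byDim!axis.each pattern 
mentioned above, written against plain mir ndslices rather than 
Magpie's DataFrame. It assumes mir-algorithm is available as a 
dependency; sumsAlong is a hypothetical helper introduced only for 
illustration and is not part of Mir or Magpie.

```d
import mir.ndslice.topology : byDim, iota;
import mir.algorithm.iteration : each;

// Hypothetical helper: visit every lane of a 2-D slice along
// dimension `dim` (0 = rows, 1 = columns) and collect each lane's sum.
long[] sumsAlong(size_t dim, S)(S matrix)
{
    long[] sums;
    // byDim!dim yields one 1-D slice per index of that dimension;
    // each!f then calls f once per lane.
    matrix.byDim!dim.each!((lane) {
        long s = 0;
        foreach (x; lane)
            s += x;
        sums ~= s;
    });
    return sums;
}

void main()
{
    auto m = iota(2, 3); // 2x3 matrix: [[0, 1, 2], [3, 4, 5]]
    assert(sumsAlong!0(m) == [3L, 12L]);    // one sum per row
    assert(sumsAlong!1(m) == [3L, 5L, 7L]); // one sum per column
}
```

An apply(f, axis) overload without the index parameter would 
presumably reduce to something of this shape, with the axis chosen 
at compile time.
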
When heterogeneous DataFrame support was added, mir-algorithm was 
dropped from the dependencies and a Structure of Arrays 
implementation was taken up using TypeTuples. Once the basics are 
working solidly, I'll port useful features from Mir to Magpie.
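
To illustrate the Structure of Arrays idea, here is a minimal sketch 
using only Phobos. It is not Magpie's actual layout: Frame, 
appendRow and rows are hypothetical names chosen for this example. 
It uses std.meta (where TypeTuple now goes by the name AliasSeq) to 
store one array per column, so each column can have its own type.

```d
import std.meta : staticMap;

// Maps a column type T to its storage type T[].
alias ArrayOf(T) = T[];

// Hypothetical minimal frame: column types are fixed at compile time
// and every column is stored as its own contiguous array.
struct Frame(ColumnTypes...)
{
    alias Columns = staticMap!(ArrayOf, ColumnTypes);
    Columns columns; // expands to one field per column

    // Append one row by pushing each value onto its column.
    void appendRow(ColumnTypes row)
    {
        static foreach (i; 0 .. ColumnTypes.length)
            columns[i] ~= row[i];
    }

    // Number of rows, taken from the first column.
    size_t rows() const
    {
        return columns[0].length;
    }
}

void main()
{
    Frame!(int, string, double) df;
    df.appendRow(1, "a", 0.5);
    df.appendRow(2, "b", 1.5);

    assert(df.rows == 2);
    assert(df.columns[0] == [1, 2]);       // int column
    assert(df.columns[1] == ["a", "b"]);   // string column
    assert(df.columns[2] == [0.5, 1.5]);   // double column
}
```

Because each column is a homogeneous array, column-wise operations 
stay cheap, while row-wise access has to gather one element from 
every column.
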

