[GSoC] Dataframes for D

Prateek Nayak lelouch.cpp at gmail.com
Wed Jun 26 05:41:48 UTC 2019


On Tuesday, 25 June 2019 at 21:07:35 UTC, jmh530 wrote:
> On Tuesday, 25 June 2019 at 20:44:43 UTC, Prateek Nayak wrote:
>> [snip]
>> 2) Sure, the overload can be made but what are you 
>> specifically looking for?
>> apply(f, axis)(indexes) ?
>> [snip]
>
> I see
> void apply(alias Fn, int axis, T)(T index)
> and
> void apply(alias Fn)()
> in the current implementation.
>
> I think you interpreted what I am asking as something like
> void apply(alias Fn, int axis, T[])(T[] indices)
> which also might make sense.
>
> But I guess I was suggesting a little simpler as
> void apply(alias Fn, int axis)()
> so that it applies to all the rows or columns.
>
> This is particularly relevant in the homogeneous data case. My 
> motivation reflects a common use case of the apply function in 
> R to calculate summary statistics of an array/matrix by column 
> or row. For instance, I might want to calculate the standard 
> deviation of every column.

apply currently works exactly as
void apply(alias Fn, int axis, T)(T indices)
where indices can be an array of integers or a 2D array of string indexes:
https://github.com/Kriyszig/magpie/blob/dec86d1942f9c9b4db31438407798329af0aed96/source/magpie/dataframe.d#L1200

The overload you need also exists, as apply(alias Fn)():
https://github.com/Kriyszig/magpie/blob/dec86d1942f9c9b4db31438407798329af0aed96/source/magpie/dataframe.d#L1246

Unit test for apply:
https://github.com/Kriyszig/magpie/blob/dec86d1942f9c9b4db31438407798329af0aed96/source/magpie/dataframe.d#L2821
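
To illustrate how the two overload shapes discussed above can sit side by 
side, here is a minimal standalone sketch. It does not use magpie and is 
not its implementation: Table, twice and addOne are hypothetical names made 
up for this example, and the data is assumed to be homogeneous doubles.

```d
// Hypothetical stand-in for a homogeneous DataFrame; not magpie's type.
struct Table
{
    double[][] data; // data[row][column]

    // Shape of apply(alias Fn)(): Fn is applied to every element.
    void apply(alias Fn)()
    {
        foreach (ref row; data)
        {
            foreach (ref x; row)
                x = Fn(x);
        }
    }

    // Shape of apply(alias Fn, int axis, T)(T indices):
    // Fn is applied only to the listed rows (axis 0) or columns (axis 1).
    void apply(alias Fn, int axis, T)(T indices)
    {
        static assert(axis == 0 || axis == 1, "axis must be 0 or 1");
        foreach (i; indices)
        {
            static if (axis == 0)
            {
                foreach (ref x; data[i])
                    x = Fn(x);
            }
            else
            {
                foreach (ref row; data)
                    row[i] = Fn(row[i]);
            }
        }
    }
}

// Module-level functions passed as the alias parameter.
double twice(double x) { return x * 2; }
double addOne(double x) { return x + 1; }

void main()
{
    auto t = Table([[1.0, 2.0], [3.0, 4.0]]);
    t.apply!twice();           // every element
    t.apply!(addOne, 1)([0]);  // column 0 only
    assert(t.data == [[3.0, 4.0], [7.0, 8.0]]);
}
```

The compiler picks the overload from the number of template arguments, so 
an apply(alias Fn, int axis)() variant that covers all rows or columns 
could be added in the same way.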

I agree, calculations such as mean and standard deviation are of 
utmost importance in data science. Aggregate will provide such 
features as built-in functions: Count, Min, Max, Mean, SD, 
Variance, etc.
This will be added soon, somewhere between the final week of this 
stage (possibly sooner) and the first week of the next. As soon as 
groupBy is stable, I will get onto aggregate.
Sorry for the inconvenience.
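
Until aggregate lands, the column-wise standard deviation use case can be 
covered by hand. The following is only a standalone sketch, not magpie 
code: columnStdDev is a hypothetical helper, it assumes a rectangular 
matrix of doubles stored row by row, and it computes the population 
standard deviation (dividing by n rather than n - 1).

```d
import std.math : sqrt;

// Hypothetical helper, not part of magpie: population standard deviation
// of each column of a rectangular, row-major matrix.
double[] columnStdDev(const double[][] data)
{
    if (data.length == 0)
        return [];

    immutable rows = data.length;
    immutable cols = data[0].length;
    auto result = new double[cols];

    foreach (c; 0 .. cols)
    {
        // First pass: column mean. Start from 0, since double.init is NaN.
        double sum = 0;
        foreach (row; data)
            sum += row[c];
        immutable mean = sum / rows;

        // Second pass: mean of squared deviations from the column mean.
        double sq = 0;
        foreach (row; data)
            sq += (row[c] - mean) * (row[c] - mean);
        result[c] = sqrt(sq / rows);
    }
    return result;
}

void main()
{
    auto sd = columnStdDev([[1.0, 10.0], [3.0, 10.0]]);
    assert(sd == [1.0, 0.0]);
}
```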

