[GSoC] Dataframes for D

Prateek Nayak lelouch.cpp at gmail.com
Tue Jun 25 17:25:34 UTC 2019


>>> [snip]

-------------
Week 4 Update
-------------

This marks the completion of Stage I of Google Summer of Code 
2019. It seems like it was only yesterday when I started working 
on this project and it has already been a month.

--------------------------
So what happened last week
--------------------------
* apply - to apply a function on a row/column
* a function to convert a column of data into a level of the Index
* drop - to drop a row/column
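
As a rough illustration of what these three operations do, here is a
small Python sketch on a toy column-oriented table. It is not the
library's D API: the table layout and every function name below are
hypothetical and chosen only for this example.

```python
# Toy column-oriented table: column name -> list of values.
# Illustrative only; this is NOT how the D library stores its data.
frame = {"a": [1, 2, 3], "b": [10, 20, 30]}


def apply_column(table, name, fn):
    """Return a copy of `table` with `fn` applied to each value of column `name`."""
    out = {col: list(vals) for col, vals in table.items()}
    out[name] = [fn(v) for v in out[name]]
    return out


def drop_column(table, name):
    """Return a copy of `table` without column `name`."""
    return {col: list(vals) for col, vals in table.items() if col != name}


def drop_row(table, pos):
    """Return a copy of `table` without the row at position `pos`."""
    return {col: vals[:pos] + vals[pos + 1:] for col, vals in table.items()}


def column_to_index(table, name):
    """Move column `name` out of the data and return it as index labels."""
    index = list(table[name])
    return index, drop_column(table, name)
```

Each helper returns a new table and leaves its input untouched, which
keeps the sketch easy to reason about; a real implementation may well
modify the data in place instead.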

Going back to the original proposal, I had set aside some time 
for optimisations in case the schedule allowed it.
While testing the old parser with large files, I found that it 
failed miserably. So I redesigned the from_csv function and added 
the new version to the library as fastCSV.
fastCSV gives a 40x speed improvement over from_csv and will 
eventually replace it.

* fastCSV was added to the library.


-------------------
Plans for this week
-------------------

Plans for this stage aren't on a strict week-by-week timeline, 
but the following things will be dealt with sequentially 
throughout the stage:

This stage is reserved for the implementation of groupBy. To 
begin with, the internal structure and the grouping itself will 
be decided. Later, things like display and combining into a 
DataFrame struct will be dealt with.
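
One possible shape for such an internal structure, sketched in Python
purely for illustration: map each distinct key to the positions of the
rows that carry it, and build a per-group table from those positions
on demand. The internal structure is still to be decided, so this is
an assumption and not the design; all names are hypothetical.

```python
from collections import defaultdict

# Toy column-oriented table (illustrative only).
table = {"city": ["A", "B", "A", "B"], "sales": [1, 2, 3, 4]}


def group_rows(table, key):
    """Map each distinct value of column `key` to the row positions holding it."""
    groups = defaultdict(list)
    for pos, value in enumerate(table[key]):
        groups[value].append(pos)
    return dict(groups)


def combine(table, groups, label):
    """Build a table containing only the rows of the group named `label`."""
    positions = groups[label]
    return {col: [vals[p] for p in positions] for col, vals in table.items()}
```

Storing positions instead of copies of the rows keeps the grouping
step cheap; the data is only copied when a group is combined back
into a table.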

The following tasks were scheduled for Stage-III but will again 
fall under sequential implementation. If the above tasks are 
done, these will be dealt with next:
* Aggregate [with complete set of popular operations]
* Join, to merge two DataFrames.
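
For readers unfamiliar with the operation, a join can be sketched as
below. This is a minimal inner join in Python over rows stored as
dicts. The join variants the library will support are not stated
here, so the choice of an inner join, the row layout and the function
name are all assumptions made for this example.

```python
def inner_join(left, right, key):
    """Join two lists of dict rows on column `key`, keeping only matching keys."""
    # Index the right-hand rows by key so each left row is matched by lookup.
    lookup = {}
    for row in right:
        lookup.setdefault(row[key], []).append(row)

    joined = []
    for lrow in left:
        for rrow in lookup.get(lrow[key], []):
            # On a column name clash the left-hand value wins.
            joined.append({**rrow, **lrow})
    return joined
```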

Aggregate was reserved for later stages so that it can be 
implemented for both a normal DataFrame and groupBy at once.
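
A small Python sketch of why doing both at once is convenient: if the
grouped case is expressed as the plain case applied to each group, the
same reduction code serves both. The table layout, the
label-to-positions group structure and the function names are
hypothetical and only serve this example.

```python
def aggregate_frame(table, op):
    """Reduce every column of a plain table to a single value with `op`."""
    return {col: op(vals) for col, vals in table.items()}


def aggregate_groups(table, groups, op):
    """Apply the same reduction separately to each group.

    `groups` maps a group label to the row positions in that group.
    """
    result = {}
    for label, positions in groups.items():
        subset = {col: [vals[p] for p in positions] for col, vals in table.items()}
        result[label] = aggregate_frame(subset, op)
    return result
```

Here `op` can be any function that reduces a list to one value, such
as the built-in `sum`, `min` or `max`.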


----------
Roadblocks
----------
This week there haven't been any roadblocks. I needed the help of 
my mentors to solve a couple of errors here and there, but other 
than that things were smooth.
As for future roadblocks, I cannot see any apparent ones, but 
then again they show up when you are least expecting them :(

