dataframe implementations

Jay Norwood via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Mon Nov 2 05:54:08 PST 2015


I was reading about the Julia dataframe implementation yesterday, 
trying to understand their decisions and how D might implement one.

From my notes:
1. They are currently using a dictionary of column vectors.
2. For NA (not available) they are currently using an array of 
bytes, effectively one Boolean flag per element, rather than a 
bitVector, for performance reasons.
3. They are not currently implementing hierarchical headers.
4. They are transforming header strings that are not valid symbols 
(read from csv, for example) into valid symbols by, for example, 
replacing '.' with underscore and prefixing numbers with 'x'.  
This allows the headers to be used in expressions.
5. Together with item 4, they currently have @with for DataVector, 
which allows expressions to use, for example, :symbol_name instead 
of dv[:symbol_name].
6. They have operator symbols for per-element operations on two 
vectors; for example, a ./ b divides each element of a by the 
corresponding element of b.
7. They currently only have row indexes, no row names or symbols.
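
To make items 1, 2, 4 and 6 a bit more concrete, here is a rough 
sketch of how I picture them fitting together. It is written in 
Python purely for illustration; it is not the Julia code and not a 
proposed D design. All of the names (Column, DataFrame, 
sanitize_header, elementwise_div) are made up for this sketch, and 
the renaming rule is only my guess at the two examples in item 4.

```python
import re


def sanitize_header(name):
    """Turn a header string into an identifier-like name.

    Assumed rules, based only on the two examples in item 4:
    non-word characters such as '.' become '_', and a name that
    starts with a digit gets an 'x' prefix.
    """
    cleaned = re.sub(r"\W", "_", name)
    if cleaned and cleaned[0].isdigit():
        cleaned = "x" + cleaned
    return cleaned


class Column:
    """One column vector plus a byte-per-element NA mask (item 2)."""

    def __init__(self, values):
        # Missing entries are passed in as None; store a placeholder
        # in the data and record the gap in the mask instead.
        self.data = [0 if v is None else v for v in values]
        self.na = bytearray(1 if v is None else 0 for v in values)

    def __len__(self):
        return len(self.data)


def elementwise_div(a, b):
    """Per-element division of two columns, like a ./ b in item 6.

    NA in either input gives NA in the result. Division by a real
    zero is not handled here and raises ZeroDivisionError.
    """
    if len(a) != len(b):
        raise ValueError("columns must have the same length")
    out = Column([])
    for x, x_na, y, y_na in zip(a.data, a.na, b.data, b.na):
        if x_na or y_na:
            out.data.append(0)
            out.na.append(1)
        else:
            out.data.append(x / y)
            out.na.append(0)
    return out


class DataFrame:
    """A dictionary of column vectors keyed by sanitized name (item 1)."""

    def __init__(self, columns):
        self.columns = {
            sanitize_header(name): Column(values)
            for name, values in columns.items()
        }

    def __getitem__(self, name):
        return self.columns[name]


df = DataFrame({
    "unit.price": [10.0, None, 30.0],
    "2015 qty": [2, 4, None],
})
ratio = elementwise_div(df["unit_price"], df["x2015_qty"])
```

The byte mask costs eight times the memory of a bit vector, but 
each flag can be read or written without shifting and masking, 
which I assume is the performance reason mentioned in item 2.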

I saw someone posting here that they were working on a DataFrame 
implementation, but I haven't been able to locate any code on 
GitHub, and was wondering what implementation decisions are being 
made.  Thanks.

