nested csv into tsv

Ali Çehreli acehreli at yahoo.com
Sun Mar 18 09:53:20 PDT 2012


On 03/18/2012 07:45 AM, bioinfornatics wrote:
 > Dear all, I have this data:
 > ________________________________
 > data1	data2	data3a;data3b;data3c
 > cata1	cata2	cata3a;cata3b;cata3c
 > tata1	tata2	tata3a;tata3b;tata3c
 > ________________________________
 >
 > The fields are separated by tabs, but the third field contains data
 > separated by semicolons.
 >
 > I have tried:
 > ________________________________
 > import std.csv;
 > import std.string;
 > import std.stdio;
 >
 > struct Data{
 >      public:
 >          string field1;
 >          string field2;
 >
 >      @property void field3( string field ){
 >          _field3 = field.split(";");
 >      }
 >      @property string[] field3(  ){
 >          return _field3;
 >      }

Besides the confusion that Jesse Phillips has pointed out, csvReader has
no way of knowing that those two property functions are meant to
represent a member of Data.

 >
 >      private:
 >          string[] _field3;

Data still has three members: field1, field2, and _field3.

The problem is that, although the input format clearly consists of three
strings delimited by '\t', the third member of the struct is a string[],
not a string.

 > }
 >
 > void main(){
 >      Data[] result;
 >      File f = File( "data.csv", "r" );
 >      foreach( char[] line; f.byLine() ){
 >          result ~= csvReader!Data(line, '\t').front;
 >      }
 > }
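
The member list can be checked at compile time. The following is only a
sketch: it assumes that csvReader fills the member variables in
declaration order, and it uses FieldTypeTuple from std.traits, which is
not part of the original program.

```d
import std.array : split;
import std.stdio;
import std.traits : FieldTypeTuple;

struct Data
{
    string field1;
    string field2;

    @property void field3(string field) { _field3 = field.split(";"); }
    @property string[] field3() { return _field3; }

    private string[] _field3;
}

void main()
{
    // Lists the types of the member variables only; the two field3
    // property functions do not appear here.
    foreach (i, T; FieldTypeTuple!Data)
        writeln(i, ": ", T.stringof);
}
```

This should report three members, the last of which is a string[] rather
than a string.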

So the solution is that _field3 must be a string:

import std.csv;
import std.string;
import std.stdio;

struct Data{
     public:
         string field1;
         string field2;

     private:
         string _field3;
}

void main(){
     Data[] result;
     File f = File( "data.csv", "r" );
     foreach( char[] line; f.byLine() ){
         result ~= csvReader!Data(line, '\t').front;
     }

     writeln(result);
}

You can then provide the properties on top of that:

import std.csv;
import std.string;
import std.stdio;

struct Data{
     public:
         string field1;
         string field2;

     void field3( string[] field ) @property {
         // Join with the same ';' separator that the getter splits on
         _field3 = field.join(";");
     }

     string[] field3(  ) @property {
         return _field3.split(";");
     }

     string toString() {
         return format("%s,%s,%s", field1, field2, field3);
     }

     private:
         string _field3;
}

void main(){
     Data[] result;
     File f = File( "data.csv", "r" );
     foreach( char[] line; f.byLine() ){
         result ~= csvReader!Data(line, '\t').front;
     }

     writeln(result);
}

Note that, to avoid confusing readers of the code, both property
functions now use string[], not string. (I've also put @property at the
end of the function signature, a style I have started to favor recently.)
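
The two property functions depend on join() and split() being inverses
of each other, which holds only when both use the same ';' separator. A
small check of that round trip, using the sample values from the question
(a sketch, separate from the program above):

```d
import std.array : join, split;

void main()
{
    string[] parts = ["data3a", "data3b", "data3c"];

    // Joining with ';' reproduces the text found in the file.
    assert(parts.join(";") == "data3a;data3b;data3c");

    // Splitting on ';' recovers the original array.
    assert(parts.join(";").split(";") == parts);

    // Without a separator the pieces run together, so a later
    // split(";") cannot take them apart again.
    assert(parts.join() == "data3adata3bdata3c");
}
```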

The optimizations can come after that. The following calls split() only
when necessary:

import std.csv;
import std.string;
import std.stdio;

struct Data{
     public:
         string field1;
         string field2;

     void field3( string[] field ) @property {
         _field3 = field;
         _raw_field3 = null;
     }

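     string[] field3(  ) @property {
         if (_raw_field3 !is null) {
             // Split once, then clear the raw text so that later
             // calls return the cached result
             _field3 = _raw_field3.split(";");
             _raw_field3 = null;
         }
         return _field3;
     }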

     string toString() {
         return format("%s,%s,%s", field1, field2, field3);
     }

     private:
         string _raw_field3;
         string[] _field3;
}

void main(){
     Data[] result;
     File f = File( "data.csv", "r" );
     foreach( char[] line; f.byLine() ){
         result ~= csvReader!Data(line, '\t').front;
     }

     writeln(result);
}

Ali


