[RFC] CSV parser
Jesse Phillips
jessekphillips+d at gmail.com
Mon Apr 4 22:44:34 PDT 2011
I have implemented an input range based CSV parser that works on text
input[1]. I combined my original implementation with some details of
David's implementation[2]. It is not ready for formal review as I need to
update and polish documentation and probably consolidate unit tests.
It provides a very simple interface which can either iterate over all
elements individually or store each record in a struct. The unit
tests and examples[3] do a good job of showing the interface, but here is
just one (taken from a unit test) using a struct and a header:
string str = "a,b,c\nHello,65,63.63\nWorld,123,3673.562";

struct Layout
{
    int value;
    double other;
    string name;
}

// The header list maps columns to fields in declaration order:
// "b" -> value, "c" -> other, "a" -> name.
auto records = csvText!Layout(str, ["b","c","a"]);

// Expected contents of the two records.
Layout[2] ans;
ans[0].name = "Hello";
ans[0].value = 65;
ans[0].other = 63.63;
ans[1].name = "World";
ans[1].value = 123;
ans[1].other = 3673.562;

int count;
foreach (record; records)
{
    assert(ans[count].name == record.name);
    assert(ans[count].value == record.value);
    assert(ans[count].other == record.other);
    count++;
}
assert(count == 2);
The main implementation is in the function csvNextToken. I'm thinking it
might be useful to make this function public, as that would let users write
their own parser for malformed data or recover from it.
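To make the idea of a public token-level function concrete, here is a minimal sketch of how a caller might drive one. It is not the csvNextToken from the linked source: readField and parseAll are hypothetical names, the signature is an assumption, and the sketch only handles '\n' line endings and does not report malformed quoting.

```d
import std.array : appender;
import std.range.primitives : empty, front, popFront;

/// Hypothetical helper (not the post's csvNextToken). Reads one field from
/// `input` into `sink` and consumes the separator or newline that ends it.
/// Returns true when the field ended a record (newline or end of input).
bool readField(Sink)(ref string input, ref Sink sink,
                     dchar sep = ',', dchar quote = '"')
{
    bool quoted = false;
    while (!input.empty)
    {
        immutable dchar c = input.front;
        input.popFront();
        if (quoted)
        {
            if (c != quote)
            {
                sink.put(c);
            }
            else if (!input.empty && input.front == quote)
            {
                sink.put(quote);    // "" inside quotes is a literal quote
                input.popFront();
            }
            else
            {
                quoted = false;     // closing quote
            }
        }
        else if (c == quote)
        {
            quoted = true;
        }
        else if (c == sep)
        {
            return false;           // more fields follow in this record
        }
        else if (c == '\n')
        {
            return true;
        }
        else
        {
            sink.put(c);
        }
    }
    return true;
}

/// Hypothetical driver: collects every record as an array of strings.
string[][] parseAll(string text)
{
    string[][] records;
    string[] current;
    auto buf = appender!(char[])();
    while (!text.empty)
    {
        buf.clear();                        // reuse the same buffer
        immutable endOfRecord = readField(text, buf);
        current ~= buf.data.idup;           // copy out of the buffer
        if (endOfRecord)
        {
            records ~= current;
            current = null;
        }
    }
    if (current.length)                     // input ended right after a separator
    {
        current ~= "";
        records ~= current;
    }
    return records;
}

void main()
{
    import std.stdio : writeln;
    writeln(parseAll("a,b\n1,2"));
}
```

With a function of this shape exposed, a caller could inspect each field as it is read and decide what to do with a bad record instead of relying on the range to handle it.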
In order to be memory efficient, an appender is reused for each iteration.
However, the default behavior does result in a copy being made. To
prevent the copy, just provide the type as char[]:
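import std.stdio : writeln;

string str = `one,two,"three ""quoted""","",` ~ "\"five\nnew line\"\nsix";

// With char[] each cell is a slice of the reused buffer, so no copy is made.
auto records = csvText!(char[])(str);
foreach (record; records)
{
    foreach (cell; record)
    {
        writeln(cell);
    }
}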
If your structure stores char[] instead of string, you will also observe
the overwriting behavior. Should this be fixed?
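For anyone unsure what the overwriting looks like, here is a small standalone sketch of the mechanism using std.array.appender directly. It is not code from the parser; Cells and reuseDemo are hypothetical names used only for illustration.

```d
import std.array : appender;

/// Hypothetical holder for the two ways of keeping a cell.
struct Cells
{
    char[] aliased;   // slice that still points into the appender's buffer
    string copied;    // independent immutable copy
}

Cells reuseDemo()
{
    auto buf = appender!(char[])();
    buf.put("Hello");
    char[] aliased = buf.data;      // no copy: shares the buffer's memory
    string copied = aliased.idup;   // copy taken before the buffer is reused

    buf.clear();                    // keeps the allocation for reuse
    buf.put("World");               // typically writes over the old contents

    return Cells(aliased, copied);
}

void main()
{
    import std.stdio : writeln;
    auto cells = reuseDemo();
    writeln(cells.aliased);         // may no longer read "Hello"
    writeln(cells.copied);          // still the original text
}
```

A struct field of type char[] is in the same position as `aliased` here, which is why it sees the next record's data. One possible fix, stated only as an option, would be for the struct-filling code to dup mutable slices before storing them.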
So feel free to suggest names, implementation corrections, or
documentation changes, or just give a thumbs up. The more interest there
is, the more interest I'll have in getting this done sooner :)
[1] https://github.com/he-the-great/JPDLibs/blob/csv/csv/csv.d
[2] http://lists.puremagic.com/pipermail/phobos/2011-January/004300.html
[3] https://github.com/he-the-great/JPDLibs/tree/csv/examples