[RFC] CSV parser
Robert Jacques
sandford at jhu.edu
Tue Apr 5 08:32:12 PDT 2011
On Tue, 05 Apr 2011 01:44:34 -0400, Jesse Phillips
<jessekphillips+d at gmail.com> wrote:
> I have implemented an input range based CSV parser that works on text
> input[1]. I combined my original implementation with some details of
> David's implementation[2]. It is not ready for formal review as I need to
> update and polish documentation and probably consolidate unit tests.
[snip]
* You should support input ranges. It's fine to detect slicing and optimize
for it, but you should support simple input ranges as well.
* I'd think being able to retrieve the headings from the csv would be a
good [optional] feature.
* Exposing the tokenizer would be useful.
* Regarding buffering, it's okay for the tokenizer to expose buffering in
its API (and users should be able to supply their own buffers), but I
don't think an unbuffered version of csvText itself is correct;
csvByRecord or csvText!(T).byRecord would be more appropriate. And
anyways, since you're only using strings, why is there any buffering going
on at all? string values should simply be sliced, not buffered. Buffering
should only come into play with input ranges.
* There should be a way to specify other separators; I've started using
tab separated files as ','s show up in a lot of data.
* Any thought of parsing a file into a tuple of arrays? Writing csv?
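
A minimal sketch of the input range, headings and separator points above,
in D. readRecord is a hypothetical helper, not part of the proposed module,
and its name and signature are assumptions made for illustration. It
ignores quoting entirely and copies each field through an appender, which
is the path a plain input range would need; a string-specialised version
could slice instead.

```d
import std.array : appender;
import std.range : ElementType, empty, front, isInputRange, popFront;

// Hypothetical helper (not from the proposed module): reads one record from
// any input range of characters, consuming it up to and including the line
// terminator. Quoted fields are deliberately not handled.
string[] readRecord(Range)(ref Range input, dchar separator = ',')
    if (isInputRange!Range && is(ElementType!Range : dchar))
{
    string[] fields;
    auto field = appender!string();

    while (!input.empty)
    {
        dchar c = input.front;
        input.popFront();

        if (c == '\n')
            break;                        // end of this record
        if (c == '\r')
            continue;                     // tolerate CRLF line endings
        if (c == separator)
        {
            fields ~= field.data;         // finish the current field
            field = appender!string();    // start a fresh one
        }
        else
        {
            field.put(c);
        }
    }

    fields ~= field.data;                 // last field of the record
    return fields;
}

void main()
{
    import std.stdio : writeln;

    // Tab-separated input; the first record doubles as the headings.
    string text = "name\tcity\nAda\tLondon\n";
    auto headings = readRecord(text, '\t');

    while (!text.empty)
        writeln(headings, " -> ", readRecord(text, '\t'));
}
```

Because the separator is an ordinary run-time argument, switching between
comma and tab separated files needs no separate parser, and reading the
headings is just one extra call before the main loop.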