[RFC] CSV parser

Robert Jacques sandford at jhu.edu
Tue Apr 5 08:32:12 PDT 2011


On Tue, 05 Apr 2011 01:44:34 -0400, Jesse Phillips  
<jessekphillips+d at gmail.com> wrote:
> I have implemented an input range based CSV parser that works on text
> input[1]. I combined my original implementation with some details of
> David's implementation[2]. It is not ready for formal review as I need to
> update and polish documentation and probably consolidate unit tests.
[snip]

* You should support input ranges. It's fine to detect slicing and optimize for  
it, but you should support simple input ranges as well.

* I'd think being able to retrieve the headings from the csv would be a  
good [optional] feature.

* Exposing the tokenizer would be useful.

* Regarding buffering, it's okay for the tokenizer to expose buffering in  
its API (and users should be able to supply their own buffers), but I  
don't think an unbuffered version of csvText itself is correct;  
csvByRecord or csvText!(T).byRecord would be more appropriate. And  
anyways, since you're only using strings, why is there any buffering going  
on at all? string values should simply be sliced, not buffered. Buffering  
should only come into play with input ranges.

* There should be a way to specify other separators; I've started using  
tab separated files as ','s show up in a lot of data.

* Any thought of parsing a file into a tuple of arrays? Writing csv?
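
To make the input range, tokenizer and separator points concrete, here is a  
rough sketch of what I have in mind. It is not the API of the module under  
discussion: readRecord is a made-up name, and I'm assuming the usual  
convention that a doubled quote inside a quoted field is an escaped quote.  
It only needs empty/front/popFront, so it works on any input range of  
characters, and it copies each field into a buffer, which is what a pure  
input range forces on you. A slicing overload for strings could skip the copy.

```d
import std.array : appender;
import std.range.primitives : empty, front, popFront, isInputRange, ElementType;
import std.traits : isSomeChar;

/// Hypothetical helper (not part of the proposed module): reads one record
/// from `input` and advances it past the line terminator.
/// Assumption: "" inside a quoted field stands for a literal quote.
string[] readRecord(Range)(ref Range input, dchar separator = ',')
    if (isInputRange!Range && isSomeChar!(ElementType!Range))
{
    string[] fields;
    auto field = appender!string();   // per-field buffer
    bool inQuotes = false;

    while (!input.empty)
    {
        dchar c = input.front;
        input.popFront();

        if (inQuotes)
        {
            if (c != '"')
                field.put(c);
            else if (!input.empty && input.front == '"')
            {
                field.put('"');       // doubled quote: keep one
                input.popFront();
            }
            else
                inQuotes = false;     // closing quote
        }
        else if (c == '"')
            inQuotes = true;
        else if (c == separator)
        {
            fields ~= field.data;     // field finished
            field = appender!string();
        }
        else if (c == '\n')
            break;                    // record finished
        else if (c != '\r')           // drop the CR of a CRLF pair
            field.put(c);
    }
    fields ~= field.data;
    return fields;
}

void main()
{
    import std.stdio : writeln;

    // Tab separated, so the comma inside the quoted field is plain data.
    auto text = "name\tcity\n\"Doe, Jane\"\tBaltimore\n";
    auto headings = readRecord(text, '\t');   // first record as headings
    while (!text.empty)
        writeln(headings, " -> ", readRecord(text, '\t'));
}
```

Taking the separator as a run-time argument covers the tab separated case  
without a second parser, and treating the first record as the headings is  
just one extra call before the loop.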

