Reminder: Two days left to review Jesse Phillips' CSV Parser

Jesse Phillips jessekphillips+d at gmail.com
Wed Nov 9 21:17:48 PST 2011


On Wed, 09 Nov 2011 20:46:51 -0500, dsimcha wrote:

> On 11/9/2011 8:02 PM, dsimcha wrote:
>> As a reminder, the review of Jesse Phillips' CSV parser ends at the end
>> of Friday and will be followed by one week of voting. Please speak up
>> now about any remaining issues.
> 
> BTW, here's my second round of review now that some changes have been
> made:
> 
> Design-level stuff:
> 
> How do I use csvReader to process a CSV file as it's being read in (i.e.
> I don't want to ever have the whole file in memory)?  I guess I could do
> something like:
> 
> import std.algorithm, std.stdio, std.csv, std.conv;
> 
> void main() {
>      auto file = File("foo.csv");
>      auto chars = joiner(file.byLine(), '\n');
> 
>      auto reader = csvReader(chars);
> }
> 
> If this would work, though, then it should be documented.

I do not see why the CSV documentation should be the place to document 
how to read a file in parts. As we don't have byChar and this is non-
trivial workaround I could see wanted to have it somewhere and CSV 
parsing might be a place to apply it.

What really don't like about it is that you don't get to see the input. 
It would be like std.algorithm showing how sorting worked by showing how 
to read in an array from a file and sorting it.

This is something people need to know how to do so we should have a good 
tutorial on it, but I don't think this is the place to document it (stdio 
or std.file is more appropriate). If you strongly disagree I will find 
some place to throw in another example.

> I'd also like
> to see csvReader tested with lowest-common-denominator input ranges
> instead of just strings, which are bidirectional ranges and can be
> treated as random access if they only have ASCII characters.

Yeah, through in a unittest wrapping a wstring with only input range 
interface.

> Also, IMHO
> we really need a ByChar range in std.stdio instead of the joiner kludge
> I used, though I'm not sure about the details with regard to how Unicode
> support should work.

I thought there was an undocumented such function, but maybe not.
 
> Still no support for writing CSVs?  I understand writing is complicated
> from an API design perspective.  Maybe it would be easiest to just
> provide an enquote() function that adds quotes to a field and escapes
> any that are already inside the field and let the user handle the higher
> level stuff, since this is pretty simple.

I'm going to make something, try a couple ideas out. Thanks for the 
naming suggestion. I ended up deciding not to do anything on it as even 
if I did get it done, it wouldn't have appropriate review time. So I'll 
let the voting be the deciding factor on whether the library is delayed 
because it doesn't have csvFormatter.

After more thought I think an enquote() type function must exist even 
with a csvFormatter, but even laying out documentation on that isn't 
trivial.

> Minor documentation nitpicks:
> 
> CSVException:  IncompletCellException -> IncompleteCellException (typo)
> 
> The first csvReader example:  You declare the variable count but don't
> use it.
> 
> Records, Record:  front(), empty(), popFront() should not mention
> std.range.InputRange.  These interfaces are virtual function-based
> interfaces in case someone needs virtual function-based access to a
> range.  They are not used in most cases when ranges are used because
> templates are usually better.

Yes, my bad. I keep forgetting we have such interfaces there. I've 
instead pointed it to isInputRange which is what I meant to do. But then 
I still call it an interface... which it is, but I don't know what we 
want to call them. Concepts? :D I want this link in there so if you have 
better suggestions...

Thank you.


More information about the Digitalmars-d mailing list