Review of Jesse Phillips's CSV Parser

Walter Bright newshound2 at digitalmars.com
Sat Oct 29 19:03:08 PDT 2011


On 10/29/2011 3:33 PM, Jesse Phillips wrote:
>> "This parser will loosely follow the RFC-4180"
>>
>> How exactly will it differ from RFC-4180?
>
> I do not know how to address this. After making this statement I list the
> criteria. I don't see the need to list where it differs:
>
> * Records terminate with LF, CR and not just CRLF.
> * Using Double Quotes or Comma can be customized.
>
> I feel that I summarized well enough that listing yet another would be
> redundant.

I saw the criteria listed, but I didn't see how it was connected with it being 
different from RFC-4180. Are those criteria the differences? If so, please 
clarify that it is.


>> "RecordList"
>>
>> "Range which provides access to CSV Records and Fields."
>>
>> What is the Range? Do you mean RecordList is a Range? And what does
>> "provides access" mean? If RecordList is a Range, why is it labeled a
>> "List"?
>
> Considering it is the first documentation line of RecordList, I don't see
> where the confusion on "what the range is."
>
> I've re-written the line, it's shorter.
>
> Ok, be prepared for confusion.
>
> RecordList is a range of records, it is not named RecordRange because
> Record is also a range, a range of fields. If you have any suggestions
> for a better name, such as
>
> Document or TabularData
>
> Please let me know what you'd prefer.

When it says "List" I think of a specific data structure, i.e. linked list. 
Since it is not, it shouldn't be labeled as one.

How about simply "Records" ?


>> "Array of the heading contained in the file."
>>
>> Now I'm really confused as to what a heading is. I thought it was the
>> first line? Now there's an array of them?
>
> The line did not say "headings," but I have change it to:
>
> "Heading from the input in array form."

An example would make this much, much clearer.

> Is an example needed for both constructors, and maybe an example for
> front too?

I know that when one is intimately familiar with the code, more examples seem 
redundant. They aren't to the new user!


>> No documentation for front() or empty().
>
> These are range primitives, those programming in D already know "returns
> the current element of the range." I find it more appropriate to describe
> what is returned by front() in the description of the range (many ranges
> have become private and have no documentation at all, not even their
> existence).
>
> I will of course add documentation, but recommendations on non-pointless
> documentation would be welcome. Or how to handle the different types
> which can be assigned to Contents.

I suggest at a minimum that the docs for these say that they are part of the 
range interface. I want to press Andrei to write up a doc on "What's a Range" 
and then you can just provide a link to the right spot in that.


>> "Record"
>>
>> "Returned by a RecordList when Contents is a non-struct."
>>
>> RecordList is a struct, it does not return anything.
>
> It is a range, are we not allowed to talk of them as though they were
> high-level concepts? Do I leave off where you'd obtain a Record (it is
> private). Should I say "Returned by an instance of RecordList when
> calling front?"

I think it is very important to be precise in the use of terms in technical 
documentation. A Range doesn't "return" anything either, though there is a 
method of Range which does. A range is a set of elements that can be iterated 
through.


More information about the Digitalmars-d mailing list