[phobos] CSVRange: RFC

Jesse Phillips jesse.k.phillips at gmail.com
Sun Jan 30 12:21:27 PST 2011


In some way your comments apply to my implementation, I'll answer
assuming the latest design I have.

https://github.com/he-the-great/JPDLibs/tree/separator

And as David mentioned, it probably would be best to go with mine, but
it too is not ready for inclusion.

On Sun, Jan 30, 2011 at 11:11 AM, Andrei Alexandrescu <andrei at erdani.com> wrote:
> Code review:
>
> #50 RefRange is really ForceInputRange. What do you need it for? It's
> unusual to want to reduce the capabilities of a range.
>
> #78 isCharRange is incorrect. Correct version:
>
> enum bool isCharRange = isInputRange!R && isSomeChar!(ElementType!R);
>
> #100 Why not struct?

Answering for all of the above as mine doesn't use any of these.

I make use of two ranges and a token function. Most of the work is
done by csvNextToken and has the most unittests against it. I then
build a range which iterates each token until the end of the record,
this is struct Record. On top of that I have a range which iterates
over each Record, this is RecordList.

I avoid the need to create reference semantics by having both Record
and csvNextToken take a ref parameter of the Range.

> #102 private Appender!(char[]) _front;

As my result may not be an array, I use contactination. I do not have
a constraint on this yet...

> #306 This is CsvRange not a CsvFile as it builds on another range (that may
> or may not be backed up by a file)

As said, mine are named Record and RecordList, and I believe these are
good names as we aren't dealing with lines or files. The only concern
I have is that Record has many means, though I believe I am using it
correctly for CSV row data.

CsvRecord? DataRecord?

> #387 The name is confusing - it's a class with struct in its name.

I believe this is an implementation detail. I use csvText!CastTo(...)
where CastTo can be a struct. It will be interesting to see what I do
when I get heading support put in.

> #582 We also need a way to read CSV files into string arrays in case the
> user just wants to do the parsing and decide on typing later. Seemingly the
> current design forces choice of type before parsing.

Mine defaults to a slice of the original range, with the exception of
quoted entries. A helper function to return a Range[][] could be made.
But I think just leaving standard Range semantics is best.

> Documentation review:
>
> * O(1) is a bit inaccurate - memory consumed is proportional to that of one
> element. What you might have meant is that it does not depend on the number
> of lines in a file or on the number of CSV elements in a line.

I believe my new method has a better footprint then David's, but it no
longer operates on an InputRange, but a range with slicing and
appending. If there is no quoted data it just returns the slice of the
needed data, otherwise it appends slices of the data to a new Range.

It may be useful to have an implementation which doesn't make use of these.

> * A few artifacts have no examples.

I haven't done much for examples either.

> * The example should compile. getCharRange() does not exist. FWIW your
> design should work with byLine().

As David said, you really can't do this. You can make it work, but it
is more trouble then it is worth. I built my design off of the ideas
found in Splitter.

Feedback and suggestion are welcome, but it really hasn't seen the
comment or code standard which it needs for easy understanding.


More information about the phobos mailing list