std.range.byLine

"Nordlöw" via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Sep 11 13:03:25 PDT 2014


On Thursday, 11 September 2014 at 10:19:17 UTC, monarch_dodra 
wrote:
> Well, the issue is that this isn't very portable for *reading*, 
> as even on linux, you may read files with "\r\n" line endings 
> (It's "standard" for csv files, for example), or read "\n" 
> terminated files on windows.
> The issue is that (currently) we don't have any splitter that 
> operates on multiple needles. *That'd* be what needs to be 
> written (probably not too hard either, since "find" already 
> exists).

Good idea. So it's "just" a matter of extending splitter, via 
std.algorithm.find, to split on these three keys:
- \n
- \r
- \r\n
then? Or are there more line terminators to handle?
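
To make the idea concrete, here is a minimal sketch of a lazy range 
that treats exactly the three keys above as terminators. It is 
hand-rolled rather than built on splitter or find, and the name 
LazyLines is hypothetical (it is not part of Phobos). It assumes the 
whole input is already available as a string and ignores the further 
separators Unicode defines, such as U+2028 and U+2029.

```d
import std.algorithm : equal;
import std.stdio : writeln;

/// Hypothetical lazy line range: yields slices of the input, no copying.
struct LazyLines
{
    private string rest;     // input not yet consumed
    private string current;  // line currently exposed by front
    private bool done;

    this(string input)
    {
        rest = input;
        popFront(); // prime the first line
    }

    @property bool empty() { return done; }
    @property string front() { return current; }

    void popFront()
    {
        if (rest.length == 0)
        {
            done = true;
            return;
        }
        // Scan up to the next '\n' or '\r'. Both are ASCII, so a
        // byte-wise scan is safe on UTF-8 input.
        size_t i = 0;
        while (i < rest.length && rest[i] != '\n' && rest[i] != '\r')
            ++i;
        current = rest[0 .. i];
        if (i < rest.length)
        {
            // Treat "\r\n" as a single terminator.
            if (rest[i] == '\r' && i + 1 < rest.length && rest[i + 1] == '\n')
                i += 2;
            else
                i += 1;
        }
        rest = rest[i .. $];
    }
}

void main()
{
    // Mixed terminators in one input.
    foreach (line; LazyLines("a\nb\r\nc\rd"))
        writeln(line);
    assert(equal(LazyLines("a\nb\r\nc\rd"), ["a", "b", "c", "d"]));
}
```

Like splitLines, this sketch does not produce an extra empty line 
for a trailing terminator, but it does keep empty lines in the 
middle of the input.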

> We also have splitLines, 
> "http://dlang.org/phobos/std_string.html#.splitLines". Is that 
> good enough for you by any chance? Or do you need it to 
> actually be lazy?

Laziness is good in this case because my input files are 
gigabytes in size :) I'm playing around with single-pass parsing 
of ConceptNet5 CSV files at

https://github.com/nordlow/justd/blob/master/conceptnet5.d
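
For files of that size, one possible stopgap until a multi-needle 
splitter exists is File.byLine, which reads lazily and splits on 
"\n" by default, combined with stripping a trailing "\r" by hand. 
The sketch below is only an illustration and is not taken from 
conceptnet5.d; the function name countNonEmptyLines and the demo 
file name are made up. It does not handle files terminated by a 
lone "\r".

```d
import std.file : remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File, writeln;

/// Hypothetical helper: counts non-empty lines in a single pass.
size_t countNonEmptyLines(string path)
{
    size_t count;
    // byLine reuses its buffer, so each line is only valid
    // until the next iteration.
    foreach (line; File(path).byLine)
    {
        // Drop the '\r' left over from a "\r\n" terminator.
        if (line.length && line[$ - 1] == '\r')
            line = line[0 .. $ - 1];
        if (line.length)
            ++count;
    }
    return count;
}

void main()
{
    // Small demo file mixing "\r\n" and "\n" terminators.
    auto path = buildPath(tempDir, "byline_demo.csv");
    write(path, "a,b\r\nc,d\n\r\ne,f");
    scope (exit) remove(path);

    writeln(countNonEmptyLines(path));
}
```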

