Using lazy code to process large files

Daniel Kozak via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Wed Aug 2 06:03:00 PDT 2017


Something like: file.byLine.map!(a => a.byCodeUnit)
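
A rough, untested-on-your-data sketch of how that could combine with the per-field stripping discussed in the quoted thread below. The names cleanLine, processFile and isBlank are made up for illustration, and I am assuming that "whitespace" means only spaces and tabs. It uses the predicate version of stripLeft from std.algorithm.mutation rather than the one from std.string, so everything operates on code units and nothing gets decoded to dchar:

```d
import std.algorithm.iteration : joiner, map, splitter;
import std.algorithm.mutation : stripLeft;
import std.stdio : File;
import std.utf : byCodeUnit;

// Hypothetical predicate: treats only space and tab as blank (an assumption).
bool isBlank(char c)
{
    return c == ' ' || c == '\t';
}

// Hypothetical helper: lazily strips leading blanks from every
// comma-separated field of one line and re-joins the fields with ','.
// The result is a lazy range of code units, not a string.
auto cleanLine(const(char)[] line)
{
    return line
        .byCodeUnit                               // iterate chars, no auto-decoding
        .splitter(',')                            // lazy split into fields
        .map!(field => field.stripLeft!isBlank)   // drop leading blanks per field
        .joiner(",".byCodeUnit);                  // lazy re-join
}

// Hypothetical driver: streams input to output one line at a time.
// byLine reuses its buffer, so each line is consumed before the next read.
void processFile(File input, File output)
{
    auto sink = output.lockingTextWriter;
    foreach (line; input.byLine)
    {
        foreach (c; cleanLine(line))
            sink.put(c);
        sink.put('\n');
    }
}
```

Calling processFile(File("in.csv"), File("out.csv", "w")) would be one way to use it (both file names are placeholders). Writing code units straight to lockingTextWriter avoids building an intermediate string per line.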

On Wed, Aug 2, 2017 at 3:01 PM, Daniel Kozak <kozzi11 at gmail.com> wrote:

> Using http://dlang.org/phobos/std_utf.html#byCodeUnit could help.
>
> On Wed, Aug 2, 2017 at 2:59 PM, Martin Drašar via Digitalmars-d-learn <
> digitalmars-d-learn at puremagic.com> wrote:
>
>> On 2.8.2017 at 14:45, Steven Schveighoffer via Digitalmars-d-learn
>> wrote:
>>
>> > The problem is that you are 2 ranges deep when you apply splitter. The
>> > result of the map is a range of ranges.
>> >
>> > Then when you apply stringStripLeft, you are applying it to the map
>> > result, not the splitter result.
>> >
>> > What you need is to bury the action on each string into the map:
>> >
>> > .map!(a => a.splitter(",").map!(stringStripLeft).join(","))
>> >
>> > The internal map is because stripLeft doesn't take a range of strings
>> > (the result of splitter), it takes a range of dchar (which is each
>> > element of splitter). So you use map to apply the function to every
>> > element.
>> >
>> > Disclaimer: I haven't tested to see this works, but I think it should.
>> >
>> > Note that I have kept your call to join, even though it is not
>> > actually lazy: it builds a string out of the result (and actually
>> > probably a dstring). Use joiner to do it truly lazily.
>> >
>> > I will also note that the result is not going to look like what you
>> > think, as outputting a range looks like this: [element, element,
>> > element, ...]
>> >
>> > You could potentially output like this:
>> >
>> > output.write(result.joiner("\n"));
>> >
>> > Which I think will work. Again, no testing.
>> >
>> > I wouldn't expect good performance from this, as there is auto-decoding
>> > all over the place.
>> >
>> > -Steve
>>
>> Thanks Steven for the explanation. Just to clarify - what would be
>> needed to avoid auto-decoding in this case? Process it all as arrays,
>> using byChunk to read it, etc.?
>>
>> @kdevel: Thank you for your solution as well.
>>
>> Martin
>>
>
>