some regex vs std.ascii vs handcode times
Jay Norwood
jayn at prismnet.com
Mon Mar 19 16:37:16 PDT 2012
On Monday, 19 March 2012 at 17:23:36 UTC, Andrei Alexandrescu
wrote:
> On 3/18/12 11:12 PM, Jay Norwood wrote:
>> I'm timing operations processing 10 2MB text files in parallel. I
>> haven't gotten to the part where I put the words in the map, but
>> I've done enough through this point to say a few things about the
>> measurements.
>
> Great work! This prompts quite a few bug reports and
> enhancement suggestions - please submit them to bugzilla.
I don't know if they are bugs. On D.learn I got the explanation
that matches.captures.length() just returns the number of
expressions surrounded by () in a single match, so I don't think
it can be used to count lines, other than inside a for loop over
the matches. std.algorithm.count works ok, but I was hoping there
was something in ctRegex that would make it as fast as the
hand-coded string scan.
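To make the comparison concrete, here is a minimal sketch of the three
approaches against the current std.regex API (the function names are
mine, not from the timing runs):

```d
import std.algorithm : count;
import std.range : walkLength;
import std.regex : ctRegex, matchAll;

// captures.length is the number of ()-groups in one match, not the
// number of matches, so counting lines with a regex means walking
// the lazy range of matches one by one.
size_t countLinesRegex(string text)
{
    auto re = ctRegex!(`\n`);
    return text.matchAll(re).walkLength;
}

// std.algorithm.count does the same job without the regex machinery.
size_t countLinesCount(string text)
{
    return text.count('\n');
}

// The hand-coded scan the other versions are measured against.
size_t countLinesScan(string text)
{
    size_t n;
    foreach (c; text)
        if (c == '\n')
            ++n;
    return n;
}
```

All three agree on the result; the open question above is why the
ctRegex version can't match the speed of the last one.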
>
> Two quick notes:
>
>> On the other end of the spectrum is the byLine version of the read.
>> So this is way too slow to be promoting in our examples, and if
>> anyone is using this in their code they should instead read chunks
>> ... maybe 1MB like in my example later below, and then split up the
>> lines yourself.
>>
>> // read files by line ... yikes! don't want to do this
>> //finished! time: 485 ms
>> void wcp_byLine(string fn)
>> {
>>     auto f = File(fn);
>>     foreach(line; f.byLine(std.string.KeepTerminator.yes)){
>>     }
>> }
>
> What OS did you use? (The implementation of byLine varies a lot
> across OSs.)
I'm doing everything on win7-64 right now.
>
> I wanted for a long time to improve byLine by allowing it to do
> its own buffering. That means once you used byLine it's not
> possible to stop it, get back to the original File, and
> continue reading it. Using byLine is a commitment. This is what
> most uses of it do anyway.
>
>> Ok, this was the good surprise. Reading by chunks was faster than
>> reading the whole file, by several ms.
>
> What may be at work here is cache effects. Reusing the same 1MB
> may place it in faster cache memory, whereas reading 20MB at
> once may spill into slower memory.
Yes, I would guess that's the problem. This corei7 has an 8MB
cache, and the threadpool creates 7 active tasks by default, as I
understand it, so even 1MB blocks are on the border when running
in parallel. I'll lower the chunk size to some level that seems
reasonable and retest.
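For reference, a chunked reader along those lines might look like this
(a sketch only; the chunk size is a placeholder, and counting newlines
stands in for the real word-count work):

```d
import std.stdio : File;

// Hypothetical chunk size: with an 8MB cache shared by ~7 parallel
// tasks, something well under 1MB per task may sit in cache better.
enum chunkSize = 256 * 1024;

size_t wcp_byChunk(string fn)
{
    size_t lines;
    auto f = File(fn);
    foreach (ubyte[] chunk; f.byChunk(chunkSize))
    {
        // Split the lines ourselves instead of using byLine. A full
        // word count would also have to carry a word that straddles
        // a chunk boundary over into the next iteration.
        foreach (b; chunk)
            if (b == '\n')
                ++lines;
    }
    return lines;
}
```

Running this over the 10 files with std.parallelism's
taskPool.parallel keeps each task's working set at one chunk.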
>
>
> Andrei