Making byLine faster: we should be able to delegate this

Andrei Alexandrescu via Digitalmars-d digitalmars-d at puremagic.com
Mon Mar 23 16:33:07 PDT 2015


On 3/23/15 2:42 PM, Steven Schveighoffer wrote:
> On 3/23/15 10:59 AM, Andrei Alexandrescu wrote:
>> On 3/23/15 7:52 AM, Steven Schveighoffer wrote:
>>> On 3/22/15 3:03 AM, Andrei Alexandrescu wrote:
>>>
>>>> * assumeSafeAppend() was unnecessarily used once per line read. Its
>>>> removal led to a whopping 35% on top of everything else. I'm not sure
>>>> what it does, but boy it does take its sweet time. Maybe someone
>>>> should look into it.
>>>
>>> That's not expected. assumeSafeAppend should be pretty quick, and
>>> DEFINITELY should not be a significant percentage of reading lines. I
>>> will look into it.
>>
>> Thanks!
>>
>>> Just to verify, your test application was a simple byLine loop?
>>
>> Yes, the code was that in
>> http://stackoverflow.com/questions/28922323/improving-line-wise-i-o-operations-in-d/29153508#29153508
>>
>
> My investigation seems to suggest that assumeSafeAppend is not using
> that much time for what it does. The reason for the "35%" is that you
> are talking 35% of a very small value.

I don't see the logic here. Unless the value is so small that noise 
margins become significant (it isn't), 35% is large.
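
To put a number on it, here is a rough sketch of the arithmetic, using the
timings quoted further down in this message (2 million lines, .3 seconds with
the call, .15 seconds without). The helper names are made up for illustration,
and those timings come from a different machine than the 35% figure, so the
percentages are not expected to match.

```d
import std.stdio;

// Hypothetical helper: extra time per line attributable to the call,
// in nanoseconds.
double extraNsPerLine(double withCall, double withoutCall, long lines)
{
    return (withCall - withoutCall) / lines * 1e9;
}

// Hypothetical helper: fraction of the slower run that the call accounts for.
double shareOfRun(double withCall, double withoutCall)
{
    return (withCall - withoutCall) / withCall;
}

void main()
{
    // Timings as quoted in the thread: seconds for a 2 million line file.
    writefln("%.0f ns per line", extraNsPerLine(0.30, 0.15, 2_000_000));
    writefln("%.0f%% of the run", shareOfRun(0.30, 0.15) * 100);
}
```

With those inputs the call costs on the order of 75 ns per line, yet that is
about half of the remaining run time. Both readings are true at once: small in
absolute terms, large relative to what is left.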

> At that level, and with these
> numbers of calls, combined with the fact that the calls MUST occur
> (these are opaque functions), I think we are talking about a non issue
> here.

I disagree with this assessment. In this case it takes us from losing to
Python to beating it.

> This is what assumeSafeAppend does:
>
> 1. Access TypeInfo and convert array to "void[]" array (this could
> probably be adjusted to avoid using the TypeInfo, since assumeSafeAppend
> is a template).
> 2. Look up block info, which should be a loop through 8 array cache
> elements.
> 3. Verify the block has the APPENDABLE flag, and write the new "used"
> space into the right place.
>
> I suspect some combination of memory cache failures, or virtual function
> calls on the TypeInfo, or failure to inline some functions is what's
> slowing it down. But let's not forget that the 35% savings was AFTER all
> the original savings. On my system, using a 2 million line file, the
> original took 2.2 seconds, the version with the superfluous
> assumeSafeAppend took .3 seconds, and the version without it takes .15
> seconds.
>
> Still should be examined further, but I'm not as concerned as I was before.

We should.
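
For anyone following along, a minimal sketch of what the call changes from the
user's side. It is modeled on the example in the druntime documentation and
assumes a normal (non-sentinel) GC build. The helper name appendsInPlace is
made up for illustration, and this is not the byLine code under discussion.

```d
import std.stdio;

// Hypothetical helper: reports whether appending to a shortened slice
// reuses the original memory block.
bool appendsInPlace(bool markSafe)
{
    int[] a = [1, 2, 3, 4];
    int[] b = a[0 .. 3];        // b ends before the block's "used" mark
    if (markSafe)
        b.assumeSafeAppend();   // move the "used" mark back to the end of b
    b ~= 5;                     // in place if marked safe, else reallocates
    return a.ptr is b.ptr;
}

void main()
{
    writeln("without assumeSafeAppend: ", appendsInPlace(false));
    writeln("with assumeSafeAppend:    ", appendsInPlace(true));
}
```

The first call should report a reallocation and the second an in-place append.
Note that in the second case a[3] gets overwritten, which is why the call is
only valid when no other slice still needs the data past the end of b.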


Andrei


