Making byLine faster: we should be able to delegate this

Andrei Alexandrescu via Digitalmars-d digitalmars-d at puremagic.com
Sun Mar 22 00:03:16 PDT 2015


I just took a look at making byLine faster. It took less than one evening:

https://github.com/D-Programming-Language/phobos/pull/3089

I confess I am a bit disappointed with the leadership being unable to 
delegate this task to a trusty lieutenant in the community. A bug has 
been open on this for a long time, and it gets regularly discussed here, 
with the wrong conclusions about performance bottlenecks ("we must redo 
D's I/O because FILE* is killing it!") drawn from unverified 
assumptions. Yet the techniques used to get a marked improvement in the 
diff above are trivial fare for any software engineer. The following 
factors each had a significant impact on speed:

* On OSX (which I happened to test with) getdelim() exists but wasn't 
being used. I made the implementation use it.

* There was one call to fwide() per line read. I used simple caching (a 
stream's width cannot be changed once set, making it a perfect candidate 
for caching).

(As an aside there was some unreachable code in ByLineImpl.empty, which 
didn't impact performance but was overdue for removal.)

* For each line read there was a call to malloc() and one to free(). I 
set things up so that the buffer used for reading is reused, simply by 
making the buffer static.

* assumeSafeAppend() was unnecessarily used once per line read. Its 
removal led to a whopping 35% speedup on top of everything else. I'm not 
sure what it does, but boy it does take its sweet time. Maybe someone 
should look into it.
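
For illustration, the getdelim(), fwide() caching, and buffer-reuse points 
above could be sketched in plain C roughly as follows. This is not the code 
from the pull request: the line_reader struct and its functions are 
hypothetical names, the sketch assumes a POSIX.1-2008 system where 
getdelim() is available, and it keeps the reusable buffer in a struct 
rather than making it static.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* asks libc to declare getdelim() */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>            /* ssize_t */
#include <wchar.h>                /* fwide() */

/* Hypothetical reader state; not the Phobos data structure. */
typedef struct {
    FILE  *fp;
    char  *buf;          /* allocated by getdelim(), reused for every line */
    size_t cap;          /* current capacity of buf */
    int    orientation;  /* cached fwide() result; 0 means "not known yet" */
} line_reader;

void line_reader_init(line_reader *r, FILE *fp)
{
    assert(r != NULL && fp != NULL);
    r->fp = fp;
    r->buf = NULL;       /* getdelim() allocates on first use */
    r->cap = 0;
    r->orientation = 0;
}

/* Returns <0 for byte-oriented, >0 for wide-oriented, 0 if not yet set.
 * Once a stream has an orientation it cannot change, so a nonzero answer
 * is cached and fwide() is not called again. */
int line_reader_orientation(line_reader *r)
{
    if (r->orientation == 0)
        r->orientation = fwide(r->fp, 0);   /* mode 0 only queries */
    return r->orientation;
}

/* Reads the next line into r->buf. Returns its length in bytes, including
 * the '\n' terminator when one is present, or -1 at end of file or on
 * error. The buffer grows only when a line does not fit, so there is no
 * malloc()/free() pair per line. */
ssize_t line_reader_next(line_reader *r)
{
    return getdelim(&r->buf, &r->cap, '\n', r->fp);
}

void line_reader_free(line_reader *r)
{
    free(r->buf);
    r->buf = NULL;
    r->cap = 0;
}
```

A caller would invoke line_reader_init() once, loop on line_reader_next() 
until it returns -1, and call line_reader_free() at the end. Note that 
r->buf is overwritten by each call, so a line that must outlive the next 
read has to be copied.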

Destroy.


Andrei

