Making byLine faster: we should be able to delegate this
John Colvin via Digitalmars-d
digitalmars-d at puremagic.com
Mon Mar 23 08:00:05 PDT 2015
On Sunday, 22 March 2015 at 07:03:14 UTC, Andrei Alexandrescu wrote:
> I just took a look at making byLine faster. It took less than
> one evening:
> I confess I am a bit disappointed with the leadership being
> unable to delegate this task to a trusty lieutenant in the
> community. There's been a bug opened on this for a long time,
> it gets regularly discussed here (with the wrong conclusions
> ("we must redo D's I/O because FILE* is killing it!") about
> performance bottlenecks drawn from unverified assumptions), and
> the techniques used to get a marked improvement in the diff
> above are trivial fare for any software engineer. The following
> factors each had a significant impact on speed:
> * On OSX (which I happened to test with) getdelim() exists but
> wasn't being used. I made the implementation use it.
> * There was one call to fwide() per line read. I used simple
> caching (a stream's width cannot be changed once set, making it
> a perfect candidate for caching).
> (As an aside there was some unreachable code in
> ByLineImpl.empty, which didn't impact performance but was
> overdue for removal.)
> * For each line read there was a call to malloc() and one to
> free(). I set things up so that the buffer used for reading is
> reused by simply making the buffer static.
> * assumeSafeAppend() was unnecessarily used once per line read.
> Its removal led to a whopping 35% on top of everything else.
> I'm not sure what it does, but boy it does take its sweet
> time. Maybe someone should look into it.
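
For readers who want to see the shape of the first three points, here is a
minimal sketch in plain C. It is not the Phobos diff: the names line_reader,
lr_open, lr_next, lr_close and count_lines are made up for illustration, and
the buffer lives in a struct rather than in a static variable as described
above. It assumes a POSIX system where getdelim() is available.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose getdelim() and ssize_t */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <wchar.h>

/* Hypothetical reader state; NOT Phobos' ByLineImpl. */
typedef struct {
    FILE  *fp;
    char  *buf;    /* grown by getdelim() and reused for every line */
    size_t cap;    /* current capacity of buf */
    int    width;  /* cached fwide() result; 0 means "not known yet" */
} line_reader;

static void lr_open(line_reader *r, FILE *fp)
{
    assert(fp != NULL);
    r->fp = fp;
    r->buf = NULL;   /* getdelim() allocates on first use */
    r->cap = 0;
    r->width = 0;
}

/* Reads one line into r->buf. Returns its length including the
 * terminator (if any), or -1 on EOF, error, or a wide stream. */
static ssize_t lr_next(line_reader *r)
{
    /* Query the orientation only until it is set; once set it cannot
     * change, so the cached value stays valid from then on. */
    if (r->width == 0)
        r->width = fwide(r->fp, 0);
    if (r->width > 0)
        return -1;   /* wide-oriented streams are out of scope here */

    /* No malloc()/free() per line: getdelim() reallocates r->buf only
     * when a line is longer than the current capacity. */
    return getdelim(&r->buf, &r->cap, '\n', r->fp);
}

static void lr_close(line_reader *r)
{
    free(r->buf);    /* one free() for the whole stream */
    r->buf = NULL;
    r->cap = 0;
}

/* Counts lines in fp and sums their lengths into *total_bytes. */
static size_t count_lines(FILE *fp, size_t *total_bytes)
{
    line_reader r;
    size_t lines = 0;
    ssize_t n;

    lr_open(&r, fp);
    *total_bytes = 0;
    while ((n = lr_next(&r)) != -1) {
        lines++;
        *total_bytes += (size_t)n;
    }
    lr_close(&r);
    return lines;
}
```

Calling count_lines(stdin, &bytes) from a small main() is enough to time this
against a per-line malloc()/free() variant on the same input. The
assumeSafeAppend() point is specific to D's array runtime and has no
counterpart in this sketch.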
What would be really great would be a performance test suite for
Phobos. D is reaching a point where "It'll probably be fast
because we did it right" or "I remember it being fast-ish 3 years
ago when I wrote a small toy test" isn't going to cut it. Real
data is needed, with comparisons to other languages where