Need for speed

H. S. Teoh hsteoh at quickfur.ath.cx
Fri Apr 2 04:01:59 UTC 2021


On Fri, Apr 02, 2021 at 02:36:21AM +0000, Jon Degenhardt via Digitalmars-d-learn wrote:
> On Thursday, 1 April 2021 at 19:55:05 UTC, H. S. Teoh wrote:
[...]
> > It's interesting that whenever a question about D's performance pops
> > up in the forums, people tend to reach for optimization flags.  I
> > wouldn't say it doesn't help; but I've found that significant
> > performance improvements can usually be obtained by examining the
> > code first, and catching common newbie mistakes.  Those usually
> > account for the majority of the observed performance degradation.
> > 
> > Only after the code has been cleaned up and obvious mistakes fixed,
> > is it worth reaching for optimization flags, IMO.
> 
> This is my experience as well, and not just for D. Pick good
> algorithms and pay attention to memory allocation. Don't go crazy on
> the latter. Many people try to avoid GC at all costs, but I don't
> usually find it necessary to go quite that far. Very often simply
> reusing already allocated memory does the trick.

I've been saying this for years: the GC is (usually) not evil. It's
often quite easy to optimize away the main bottlenecks, after which any
remaining problem is not so important anymore.

For example, see this thread:

	https://forum.dlang.org/post/mailman.1589.1415314819.9932.digitalmars-d@puremagic.com

which is continued here (for some reason it was split -- the bad ole
Mailman bug, IIRC):

	https://forum.dlang.org/post/mailman.1590.1415315739.9932.digitalmars-d@puremagic.com


From a starting point of about 20 seconds total running time, I reduced
it to about 6 seconds by the following fixes:

1) Reduce GC collection frequency: call GC.disable at start of program,
   then manually call GC.collect periodically.

2) Eliminate autodecoding (using .representation or .byChar).

3) Rewrite a hot inner loop using pointers instead of .countUntil.

4) Refactor the code to eliminate a redundant computation from an inner
   loop.

5) Judicious use of .assumeSafeAppend to prevent excessive array
   reallocations.

6) (Not described in the thread, but applied later) Reduce GC load even
   further by reusing an array that was being allocated per iteration in
   an inner loop before.
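
As a rough illustration of fix (1), here is a minimal sketch, not the code
from the thread. It assumes the druntime API in core.memory, where automatic
collections are suppressed with GC.disable and re-enabled with GC.enable. The
function processAll, its parameters, and the scratch workload are all
hypothetical stand-ins.

```d
import core.memory : GC;

// Hypothetical workload: allocates a short-lived array per iteration
// and returns the sum of everything written into those arrays.
size_t processAll(size_t iterations, size_t collectEvery)
{
    // Suppress automatic collections. The runtime may still collect
    // if it has to, e.g. when it runs out of memory.
    GC.disable();
    scope (exit) GC.enable();   // restore normal behaviour on exit

    size_t total = 0;
    foreach (i; 0 .. iterations)
    {
        auto scratch = new int[](64);      // garbage created each iteration
        scratch[] = cast(int) (i % 7);
        foreach (x; scratch)
            total += x;

        // Collect at a point of our own choosing instead of letting
        // an allocation trigger it at an arbitrary moment.
        if ((i + 1) % collectEvery == 0)
            GC.collect();
    }
    return total;
}

void main()
{
    import std.stdio : writeln;
    writeln(processAll(100_000, 10_000));
}
```

The collection interval (collectEvery here) is a tuning knob: too small and
nothing is gained, too large and memory use grows between collections.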

Of the above, (1), (2), (3), and (5) require only very small code
changes. (4) and (6) were a little more tricky, but were pretty
localised changes that did not take a long time to implement or affect a
lot of code.  They were all implemented in a short span of 2-3 days.
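
To give an idea of how small a change like (2) can be, here is a sketch,
assuming std.string.representation and std.algorithm.searching.count from
Phobos. The two function names are made up for the example, and it is only
valid when the value being searched for is a single ASCII code unit.

```d
import std.algorithm.searching : count;
import std.string : representation;

// Range algorithms applied directly to a string decode it into
// dchar code points as they iterate (autodecoding).
size_t countSpacesDecoded(string s)
{
    return s.count(' ');
}

// .representation reinterprets the string as immutable(ubyte)[],
// so the same algorithm walks raw code units with no decoding.
// Safe here because an ASCII byte never occurs inside a multi-byte
// UTF-8 sequence.
size_t countSpacesRaw(string s)
{
    return s.representation.count(cast(ubyte) ' ');
}

void main()
{
    import std.stdio : writeln;
    auto line = "the quick brown fox";
    writeln(countSpacesDecoded(line), " ", countSpacesRaw(line));
}
```

Both functions return the same count; the second simply skips the decoding
work, which is where the savings come from in a hot loop.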

Compare this with outright writing @nogc code, which would require a LOT
more time & effort.
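
For contrast, fixes (5) and (6) stay entirely within GC-managed arrays. A
minimal sketch of the idea, assuming the built-in assumeSafeAppend from
object.d; the helper collectEvens and its filtering logic are hypothetical
and only stand in for whatever the inner loop actually computes.

```d
// Hypothetical inner-loop helper: gathers the even values of `input`
// into `buf`, reusing whatever memory `buf` already owns.
int[] collectEvens(ref int[] buf, const int[] input)
{
    buf.length = 0;          // drop the contents, keep the memory block
    buf.assumeSafeAppend();  // promise: no other slice still needs the old data
    foreach (x; input)
        if (x % 2 == 0)
            buf ~= x;        // appends in place while capacity lasts
    return buf;
}

void main()
{
    import std.stdio : writeln;

    int[] buf;
    buf.reserve(1024);       // one up-front allocation
    foreach (round; 0 .. 3)
    {
        // The returned slice aliases buf and is overwritten next round.
        auto evens = collectEvens(buf, [1, 2, 3, 4, 5, 6]);
        writeln(evens);
    }
}
```

Without the assumeSafeAppend call, appending to the shrunk array would
reallocate every time, because the runtime has to assume another slice might
still refer to the old elements. The call is only correct if that is not
the case.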


> The blog post I wrote a few years ago focuses on these ideas:
> https://dlang.org/blog/2017/05/24/faster-command-line-tools-in-d/

Very nice, and matches my experience with optimizing D code.


T

-- 
Live for a century, learn for a century. And you'll still die a fool.

