Avoid gratuitous closure allocations

Ali Çehreli acehreli at yahoo.com
Fri Sep 20 11:21:22 UTC 2019


tl;dr Instead of returning an object that uses local state, return an 
object that uses member variables.

We discovered one such allocation inside std.format.sformat today
during our local meetup[1], started fixing it, and then found that it
had already been fixed by ag0aep6g just 19 days earlier after a forum
discussion[2]. Awesome! :)

Although such closures are usually small, any GC allocation can trigger
a collection, and a collection can have a big impact on program
performance, especially in multi-threaded programs, because D's current
GC stops the world. It is so easy to fall into this pessimization that
I used one myself in a recent forum post[3].

For example, here is a range function that mimics std.range.enumerate
with the help of a Voldemort type[4]:

import std.stdio;
import std.range;
import std.typecons;

auto enumerated(R)(R range) {
   size_t i;

   struct Range {
     auto empty()    { return range.empty; }
     auto front()    { return tuple(i, range.front); }
     auto popFront() { range.popFront(); ++i; }
   }

   // The returned object requires a closure because it uses
   // 'range' and 'i' from the local context. OUCH OUCH OUCH!
   return Range();
}

void main() {
   // Aside: Tuple expansion special "feature" of D (as 'i' and 'e' here)
   foreach (i, e; iota(42, 50).enumerated) {
     writefln!"%s: %s"(i, e);
   }
}

If you compile the program with the -profile=gc command line switch (I
used dmd) and run it, you will see a log file created in the current
directory: profilegc.log. That file will point to 1 GC allocation of 32
bytes inside your source code. That allocation may seem trivial, but if
you used enumerated() many times, e.g. in an inner loop, you could be
"stopping the world" many times unnecessarily.

The solution is trivial in this case:

1) Move all local state to the struct; i.e. define a 'range' member 
variable inside the struct and make 'i' a member variable of the struct.

Note: Copying the 'range' parameter to a member variable and consuming 
that member variable instead may behave differently for some range types 
but it's off-topic for this discussion. :)

2) Although there would be no closure allocations after step 1, define 
the struct as 'static' to guarantee that it will stay that way even 
after changing the code in the future.

The new function is the following:

auto enumerated(R)(R range) {
   static struct Range {    // 2) Defined as 'static'
     R range;               // 1a) Member variable
     size_t i;              // 1b) Member variable

     auto empty()    { return range.empty; }
     auto front()    { return tuple(i, range.front); }
     auto popFront() { range.popFront(); ++i; }
   }

   return Range(range);     // 1c) Pass the parameter to the object
}

Now the unnecessary closure allocation is gone.
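
One way to have the compiler check this, instead of reading
profilegc.log, is to call the function from @nogc code. The following is
only a sketch under a few assumptions: it is compiled with -unittest,
the attributes of enumerated() and of the struct's member functions are
inferred as @nogc for the range type used (iota here), and the static
assert with __traits(isNested) is an extra check I added to confirm
that the returned type carries no context pointer.

```d
import std.range : iota;
import std.typecons : tuple;

auto enumerated(R)(R range) {
   static struct Range {
     R range;
     size_t i;

     auto empty()    { return range.empty; }
     auto front()    { return tuple(i, range.front); }
     auto popFront() { range.popFront(); ++i; }
   }

   return Range(range);
}

// A static struct has no hidden pointer to the enclosing function's
// context, so there is nothing for a closure to keep alive.
static assert(!__traits(isNested, typeof(iota(0, 1).enumerated)));

// Compilation of this block fails if anything called inside it may
// allocate from the GC, including a closure for enumerated().
@nogc unittest {
   size_t count;

   foreach (i, e; iota(42, 50).enumerated) {
     assert(e == 42 + cast(int) i);
     ++count;
   }

   assert(count == 8);
}
```

If the original closure-based version is put in place of this one, I
would expect the @nogc block to be rejected at compile time, which
makes it a cheap regression guard.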

Ali

[1] https://www.meetup.com/D-Lang-Silicon-Valley/
[2] https://forum.dlang.org/post/tdgiytvqpyxevjtqgbao@forum.dlang.org
[3] https://forum.dlang.org/post/qlovst$okj$1@digitalmars.com
[4] https://wiki.dlang.org/Voldemort_types

