Avoid gratuitous closure allocations
Ali Çehreli
acehreli at yahoo.com
Fri Sep 20 11:21:22 UTC 2019
tl;dr Instead of returning an object that uses local state, return an
object that uses member variables.
We discovered one such allocation inside std.format.sformat today
during our local meetup[1], started fixing it, and then found that it
had already been fixed by ag0aep6g just 19 days earlier after a forum
discussion[2]. Awesome! :)
Although the sizes of such closures are usually small, any garbage
collection allocation can have a big impact on program performance
especially in multi-threaded programs due to D's current stop-the-world
GC scheme. It is so easy to fall into this pessimization that I've used
one in a recent forum post[3] myself.
For example, here is a range function that mimics std.range.enumerate
with the help of a Voldemort type[4]:

    import std.stdio;
    import std.range;
    import std.typecons;

    auto enumerated(R)(R range) {
        size_t i;

        struct Range {
            auto empty() { return range.empty; }
            auto front() { return tuple(i, range.front); }
            auto popFront() { range.popFront(); ++i; }
        }

        // The returned object requires a closure because it uses
        // 'range' and 'i' from the local context. OUCH OUCH OUCH!
        return Range();
    }

    void main() {
        // Aside: Tuple expansion special "feature" of D (as 'i' and 'e' here)
        foreach (i, e; iota(42, 50).enumerated) {
            writefln!"%s: %s"(i, e);
        }
    }

If you compile the program with the -profile=gc command line switch (I
used dmd) and run it, you will see a log file created in the current
directory: profilegc.log. That file will point to 1 GC allocation of 32
bytes inside your source code. That allocation may seem trivial but if
you used enumerated() multiple times, e.g. in an inner loop, every call
would be another chance to trigger a collection and "stop the world"
unnecessarily.
The solution is trivial in this case:
1) Move all local state to the struct; i.e. define a 'range' member
variable inside the struct and make 'i' a member variable of the struct.
Note: Copying the 'range' parameter to a member variable and consuming
that member variable instead may behave differently for some range types
but it's off-topic for this discussion. :)
2) Although there would be no closure allocations after step 1, define
the struct as 'static' to guarantee that it will stay that way even
after changing the code in the future.
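
What 'static' changes in step 2 can be sketched with a compile-time
check. This is only an illustration and not part of the original
program: the names checkNesting, Nested, and Plain are made up here, and
the sketch assumes the compiler's __traits(isNested, ...) query, which
reports whether a type carries a hidden context pointer.

```d
void checkNesting() {
    int local = 7;

    // Non-static: member functions may touch 'local', so the struct
    // carries a hidden pointer to the enclosing function's context.
    struct Nested {
        int get() { return local; }
    }

    // Static: no context pointer; all state must live in members.
    static struct Plain {
        int value;
        int get() { return value; }
    }

    static assert( __traits(isNested, Nested));
    static assert(!__traits(isNested, Plain));

    assert(Plain(7).get() == 7);
}

void main() {
    checkNesting();
}
```

If a later change made a member of the static struct refer to a local
variable again, the code would stop compiling instead of silently going
back to a closure.
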
The new function is the following:

    auto enumerated(R)(R range) {
        static struct Range {      // 2) Defined as 'static'
            R range;               // 1a) Member variable
            size_t i;              // 1b) Member variable

            auto empty() { return range.empty; }
            auto front() { return tuple(i, range.front); }
            auto popFront() { range.popFront(); ++i; }
        }

        return Range(range);       // 1c) Pass the parameter to the object
    }

Now the unnecessary closure allocation is gone.
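
One way to have the compiler confirm this, besides reading
profilegc.log, is to mark the code @nogc. The sketch below repeats the
fixed function with explicit @nogc attributes; those attributes and the
assert-based main() are additions for illustration, not part of the
original program. It assumes that the range passed in (iota here) has
@nogc primitives itself.

```d
import std.range;
import std.typecons;

// Explicit @nogc: compilation fails if this function needs a closure
// or makes any other GC allocation.
auto enumerated(R)(R range) @nogc {
    static struct Range {
        R range;
        size_t i;

        auto empty() { return range.empty; }
        auto front() { return tuple(i, range.front); }
        auto popFront() { range.popFront(); ++i; }
    }

    return Range(range);
}

void main() @nogc {
    size_t count;

    foreach (i, e; iota(42, 50).enumerated) {
        assert(e == 42 + i);   // index and element stay in step
        ++count;
    }

    assert(count == 8);
}
```

Adding the same attribute to the first version of enumerated() should
make the compiler reject it, because returning Range() there needs a
GC-allocated closure.
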
Ali
[1] https://www.meetup.com/D-Lang-Silicon-Valley/
[2] https://forum.dlang.org/post/tdgiytvqpyxevjtqgbao@forum.dlang.org
[3] https://forum.dlang.org/post/qlovst$okj$1@digitalmars.com
[4] https://wiki.dlang.org/Voldemort_types