Global thread-local free-list allocator

Witold witold.baryluk+dlang at gmail.com
Thu Mar 16 20:53:28 UTC 2023


I ported a classic DeltaBlue benchmark to D, and for allocations 
I use a mix of malloc/free (for structs with some trailing inline 
arrays), and new for some other structs and arrays.

After running it, it felt kind of slow (I did not compare it to 
any other language yet, so no idea if it is actually slower, just 
feels slow), so I did some profiling (perf / hotspot mostly), and 
noticed that substantial amount of time is spent in gc (both 
allocation and collection). This is on Linux, and benchmark 
indeed creates a lot of garbage, as I rerun and rerun it in a 
loop.

In the past I used custom per-type allocator using D1 style 
allocator methods, mostly together with hand coded free lists 
(was working fine and was short).

Now I tried to use new `std.experimental.allocator`, and have 
trouble understanding how to use them.

```d
import std.experimental.allocator : allocatorObject, ThreadLocal;
import std.experimental.allocator.building_blocks.free_list : 
FreeList;
import std.experimental.allocator.gc_allocator : GCAllocator;
import std.experimental.allocator.mallocator : Mallocator;
import std.experimental.allocator.common : unbounded;
// static auto variableAllocator = 
allocatorObject(FreeList!(GCAllocator, Variable.sizeof)());
// static auto variableAllocator = 
allocatorObject(FreeList!(GCAllocator, Variable.sizeof, 
unbounded)());
// static auto variableAllocator = 
allocatorObject(FreeList!(Mallocator, Variable.sizeof, 
unbounded)());
alias VariableAllocator = ThreadLocal!(FreeList!(Mallocator, 
Variable.sizeof, unbounded));

// VariableAllocator variableAllocator;
// this() {
//   variableAllocator = VariableAllocator.instance;
//}

struct Variable {
   long value;
   List!Constraint* constraints;
   Constraint* determinedBy;
   long mark;
   Strength walkStrength;
   bool stay;
   // char[10] name;
   string name;

   static Variable* Create(string name, long initialValue) {
     // auto new_ = new Variable();
     auto new_ = VariableAllocator.instance.make!Variable();
     // auto new_ = variableAllocator.make!Variable();
     new_.value = initialValue;  // CRASH
     ...
     return new_;
   }
   static void Destroy(Variable* v) {
     assert(v.constraints !is null, "bad Variable struct; already 
freed?");
     List!Constraint.Destroy(v.constraints);
     v.constraints = null;
     // free(v);
     // destroy(v);
     VariableAllocator.instance.dispose(v);
   }

   ...
   ...
}
```


It crashes at `new_.value`, as `new_` is `null`

```
Reading symbols from ./t1_gdc_debug...
(gdb) r
Starting program: /home/user/benchy/t1_gdc_debug
[Thread debugging using libthread_db enabled]
Using host libthread_db library 
"/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x000055555557c86a in 
deltablue.Variable.Create(immutable(char)[], long) (name=..., 
initialValue=0) at /home/user/benchy/deltablue.d:248
248	    new_.value = initialValue;
(gdb) bt
#0  0x000055555557c86a in 
deltablue.Variable.Create(immutable(char)[], long) (name=..., 
initialValue=0) at /home/user/benchy/deltablue.d:248
#1  0x000055555557f14e in deltablue.ChainTest(int) (n=100) at 
/home/user/benchy/deltablue.d:901
#2  0x000055555557f88c in deltablue.deltablue(out ulong, out 
ulong) (iters=@0x7fffffffd9b8: 0, flops=@0x7fffffffd9c0: 0)
     at /home/user/benchy/deltablue.d:999
#3  0x00005555555e2b3e in b (__capture=0x7fffffffdba0, name=..., 
expectedResult=0) at /home/user/benchy/main.d:60
#4  0x00005555555e2417 in D main (args=...) at 
/home/user/benchy/main.d:91
(gdb) q

```


I want to use FreeList with parent allocator being Mallocator. 
Never return allocations back to Mallocator, and for each 
instance to be thread-local, AND for accessing thread-local 
instance to not require ANY locks.

Documentation for `std.experimental.allocator` is kind of vague, 
complicated and not very prescriptive.

Help would be appreciated.



More information about the Digitalmars-d-learn mailing list