Would you pay for GC?
Ola Fosheim Grøstad
ola.fosheim.grostad at gmail.com
Fri Jan 28 13:14:16 UTC 2022
On Friday, 28 January 2022 at 10:18:32 UTC, IGotD- wrote:
> On Wednesday, 26 January 2022 at 06:20:06 UTC, Elronnd wrote:
>>
>> Thread-local gc is a thing. Good for false sharing too
>> (w/real threads); can move contended objects away from owned
>> ones. But I see no reason why fibre-local heaps should need
>> to be much different from thread-local heaps.
>>
>
> I would like to challenge the idea that thread-aware GC would
> do much for performance. Pegging memory to one thread is
> unusual and doesn't often correspond to reality.
>
> For example, a computer game with a large amount of vertex
> data where you decide to split up the workload across several
> threads. You don't make a thread-local copy of that data but
> keep the original vertex data global, and even the destination
> buffer would be global.
Which is why you would want ARC for shared objects and a local GC
for tasks/actors.
Then what you need for more flexibility and optimization is
static analysis that determines if local objects can be turned
into shared objects. If that is possible you could put them in a
separate region of the GC heap with space for a RC field at
negative offset.
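To make the negative-offset idea concrete, here is a minimal C sketch of what such a promotable region could look like. This is purely illustrative, not how D's GC actually works: every name (`promotable_alloc`, `promote_to_shared`, etc.) is hypothetical, and the point is only that the payload pointer never changes when a local object becomes shared, since the RC word already sits just before it.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical layout: objects in the "promotable" region reserve
 * one word *before* the payload for a reference count. The payload
 * pointer handed to user code never changes, so promoting a local
 * object to shared just means starting to use that word. */
typedef struct {
    atomic_size_t rc;   /* stays 0 while the object is still local */
} obj_header;

static void *promotable_alloc(size_t payload_size) {
    obj_header *h = malloc(sizeof(obj_header) + payload_size);
    if (!h) return NULL;
    atomic_init(&h->rc, 0);          /* local: RC not in use yet */
    return (void *)(h + 1);          /* caller sees only the payload */
}

static obj_header *header_of(void *payload) {
    return (obj_header *)payload - 1; /* RC at negative offset */
}

static void promote_to_shared(void *payload) {
    atomic_store(&header_of(payload)->rc, 1); /* first shared ref */
}

static void retain(void *payload) {
    atomic_fetch_add(&header_of(payload)->rc, 1);
}

static void release(void *payload) {
    if (atomic_fetch_sub(&header_of(payload)->rc, 1) == 1)
        free(header_of(payload));
}
```

While the object is local, the actor's GC can ignore the header entirely; only after promotion do the atomic retain/release operations come into play.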
> What I can think of is a server with one thread per client,
> with data that no other thread works on.
It shouldn't be per thread, but per actor/task/fiber.
> My experience is that this thread model isn't good programming,
> and servers should instead be completely async, meaning any
> thread might handle the next partial work.
You have experience with this model? From where?
Actually, it could be massively beneficial if you have short
lived actors and most objects have trivial destructors. Then you
can simply release the entire local heap with no scanning.
You basically get to configure the system to use arena allocators
with GC fallback for out-of-memory situations. Useful for actors
where most of the memory they hold is released towards the end of
the actor's lifetime.
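A per-actor heap along those lines can be sketched in a few lines of C. This is an assumption-laden toy, not an actual D runtime mechanism: the bump region stands in for the actor's local heap, and a tracked list of malloc'd blocks stands in for the GC fallback. The payoff is in `arena_destroy`: the whole local heap goes away with no scanning and no per-object work, which is exactly the trivial-destructor case described above.

```c
#include <stdlib.h>

/* Hypothetical per-actor arena: a bump allocator over one block,
 * with a tracked fallback list (standing in for the GC heap) when
 * the block is exhausted. Destroying the actor frees everything
 * without scanning, assuming trivial destructors. */
typedef struct fallback { struct fallback *next; } fallback;

typedef struct {
    char     *base, *cur, *end;  /* bump region */
    fallback *overflow;          /* fallback allocations */
} arena;

static int arena_init(arena *a, size_t cap) {
    a->base = a->cur = malloc(cap);
    if (!a->base) return -1;
    a->end = a->base + cap;
    a->overflow = NULL;
    return 0;
}

static void *arena_alloc(arena *a, size_t n) {
    n = (n + 15) & ~(size_t)15;           /* keep 16-byte alignment */
    if ((size_t)(a->end - a->cur) >= n) { /* fast path: just bump */
        void *p = a->cur;
        a->cur += n;
        return p;
    }
    /* Arena exhausted: fall back, remembering the block so
     * teardown can still release it. */
    fallback *f = malloc(sizeof(fallback) + n);
    if (!f) return NULL;
    f->next = a->overflow;
    a->overflow = f;
    return f + 1;
}

static void arena_destroy(arena *a) {
    for (fallback *f = a->overflow; f; ) { /* walk fallback list */
        fallback *next = f->next;
        free(f);
        f = next;
    }
    free(a->base);                         /* drop the whole region */
}
```

Note that the fast path is a pointer bump and a compare, which is hard to beat; the GC only has to earn its keep on the overflow list.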
> As I see it, thread-aware GC doesn't do much for performance
> but complicates things for the programmer.
You cannot discuss performance without selecting a particular
realistic application. Which is why system level programming
requires multiple choices and configurations if you want
automatic memory management. There is simply no model that works
well with all scenarios.
What is needed for D is to find a combination that works both
for current high-level D users and also makes automatic memory
management more useful in more system-level programming
scenarios.
Perfection should be considered out of scope.
More information about the Digitalmars-d
mailing list