Oh, my GoD! Goroutines on D

Mathias LANG geod24 at gmail.com
Tue May 26 01:27:49 UTC 2020


On Monday, 25 May 2020 at 16:26:31 UTC, Jin wrote:
> On Saturday, 16 May 2020 at 20:06:47 UTC, mw wrote:
>> On Tuesday, 29 March 2016 at 17:10:02 UTC, Jin wrote:
>>>
>>> http://wiki.dlang.org/Go_to_D
>>
>> Any performance comparison with Go? Esp. in real-world scenarios?
>>
>> Can it easily handle hundreds of (go)routines?
>
> I have updated the code. But it isn't ready to use currently 
> because:
>
> 1. I rewrote the code to use std.parallelism instead of vibe.d, 
> so it's difficult to integrate fibers with tasks. Now every task 
> spin-locks while waiting on a channel and the main thread does 
> no useful work.
>
> 2. Race condition. I'm going to closely review the algorithm.
>
> [...]
>
> It would be cool if someone helped me with it. There are 
> docstrings, tests and diagrams. I'll explain more if someone 
> joins.

This is a problem that's of interest to me as well, and I've been 
working on this for a few months (on and off).
I eventually had to ditch `std.concurrency` because of some 
design decisions that made it hard to work with.

`std.concurrency`'s `MessageBox` was originally designed to be 
used only between threads. As such, it comes with all the locking 
you'd expect from a cross-thread message-passing data structure. 
Support for fibers was added as an afterthought. You can even see 
it in the documentation 
(https://dlang.org/phobos/std_concurrency.html), where "thread" 
is mentioned all over the place. The module doc mostly gets away 
with it because it calls fibers "logical threads", but that 
distinction is not always made. It also has some concepts that 
make a lot of sense for threads, but much less so for Fibers 
(such as the "owner" concept, which is the task that `spawn`ed 
you). Finally, it forces messages to be `shared` or isolated 
(read: with only `immutable` indirections), which doesn't make 
sense when you're dealing only with Fibers on the same thread.
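
To make the "owner" concept and the aliasing restriction concrete, 
here is a minimal sketch using the stock `std.concurrency` API. 
The names `doubler` and `roundTrip` are made up for illustration. 
The `static assert`s use `std.traits.hasUnsharedAliasing`, which, 
as far as I can tell, is the trait behind the check `send` 
performs on its arguments; treat that link as an assumption and 
not as a description of the exact implementation.

```d
import std.concurrency : ownerTid, receiveOnly, send, spawn;
import std.traits : hasUnsharedAliasing;

// Hypothetical worker: runs in the task created by `spawn`.
void doubler()
{
    immutable n = receiveOnly!int();
    // `ownerTid` is the task that `spawn`ed us.
    send(ownerTid, n * 2);
}

// Hypothetical helper: one request/response round trip.
int roundTrip(int n)
{
    auto tid = spawn(&doubler);
    send(tid, n);
    return receiveOnly!int();
}

// Mutable, thread-local indirections are what gets rejected...
static assert( hasUnsharedAliasing!(int[]));
// ...while `immutable` indirections are fine.
static assert(!hasUnsharedAliasing!(immutable(int)[]));
static assert(!hasUnsharedAliasing!string);

unittest
{
    assert(roundTrip(21) == 42);
}
```

This can be tried with `dmd -unittest -main -run file.d`. Note 
that an `int[]` payload, which would be perfectly safe between 
two Fibers on the same thread, falls on the rejected side.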

We found some ridiculous issues when trying to use it. We 
upstreamed some fixes (https://github.com/dlang/phobos/pull/7096, 
https://github.com/dlang/phobos/pull/6738) and put a bounty on 
one of the issues, which led to someone finding the bug in 
`std.concurrency` 
(https://github.com/Geod24/localrest/pull/5#issuecomment-523707490). 
After some playing around with it, we just gave up, forked the 
whole module, and started changing it to behave more like 
channels. There are some other issues I found while refactoring 
which I might upstream in the future, but it needs so much work 
that I might as well PR a whole new module.

What we're trying to achieve is to move from a `MessageBox` 
approach, where there is a 1-to-1 relationship between a task (or 
logical thread) and a `MessageBox`, to a channel-like model, where 
there is an N-to-1 relationship (see Go's `select`).
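
As a rough illustration of the channel-like model, here is a 
sketch of one Fiber waiting on several same-thread channels, in 
the spirit of Go's `select`. Everything in it (`LocalChannel`, 
`tryReceive`, `receiveAny`, `selectDemo`) is a hypothetical name 
invented for this example, not the API of our fork. It assumes 
all Fibers run on one thread, so there is no locking and the 
payload is a plain mutable `int[]`.

```d
import core.thread : Fiber;
import std.conv : text;

/// Hypothetical same-thread channel: no locks, no `shared`.
final class LocalChannel(T)
{
    private T[] queue;

    void send(T value) { queue ~= value; }

    /// Non-blocking receive; returns false if nothing is pending.
    bool tryReceive(out T value)
    {
        if (queue.length == 0)
            return false;
        value = queue[0];
        queue = queue[1 .. $];
        return true;
    }
}

/// Select-like receive: yields the calling Fiber until any of the
/// channels has a message, then returns that channel's index.
size_t receiveAny(T)(LocalChannel!T[] channels, out T value)
{
    while (true)
    {
        foreach (i, ch; channels)
            if (ch.tryReceive(value))
                return i;
        Fiber.yield();
    }
}

string[] selectDemo()
{
    auto channels = [new LocalChannel!(int[]), new LocalChannel!(int[])];
    string[] log;

    auto consumer = new Fiber({
        foreach (_; 0 .. 2)
        {
            int[] msg;
            immutable idx = receiveAny!(int[])(channels, msg);
            log ~= text("channel ", idx, ": ", msg);
        }
    });
    auto producerB = new Fiber({ channels[1].send([3, 4]); });
    auto producerA = new Fiber({ channels[0].send([1, 2]); });

    // Hand-written schedule standing in for a real scheduler.
    foreach (f; [consumer, producerB, consumer, producerA, consumer])
        if (f.state != Fiber.State.TERM)
            f.call();
    return log;
}

unittest
{
    assert(selectDemo() == ["channel 1: [3, 4]", "channel 0: [1, 2]"]);
}
```

The consumer is woken by whichever channel gets a message first, 
which a single per-task `MessageBox` cannot express directly. The 
polling loop in `receiveAny` is only there to keep the sketch 
short; a real implementation would park the Fiber instead.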

In order to achieve Go-like performance, we need a few things 
though:
- Direct hand-off semantics for same-thread message passing: 
Meaning that if Fiber A sends a message to Fiber B, and they are 
both in the same thread, there is an immediate context switch 
from A to B, without going through the scheduler;
- Thread-level multiplexing of receive: With the current 
`std.concurrency`, calling `receive` yields the fiber and might 
block the Thread. The scheduler simply iterates over all Fibers 
in a linear order, which means you could end up in a situation 
where, if you have 3 Fibers, and they all `receive` one after the 
other, you'll end up being blocked on the *first* one receiving a 
message to wake the other ones up.
- Smaller Fibers: Goroutines can have very, very small stacks. 
They don't stack overflow because they are managed (whenever you 
need to allocate more stack, there used to be a check for stack 
overflow, and stack "regions" were/are essentially a linked list 
and need not be contiguous in memory). On the other hand, we use 
simple regular fiber context switching, which is much more 
expensive. In that area, I think exploring the idea of a 
stackless coroutine based scheduler could be worthwhile.
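
The direct hand-off point from the list above can be sketched 
with plain `core.thread.Fiber`. `HandOff` and `handOffDemo` are 
hypothetical names for this example only. One caveat: D's Fibers 
are asymmetric, so `call` nests the receiver inside the sender, 
and when the receiver yields, control returns to the sender. That 
only approximates what Go does, but it does show the sender 
switching straight to the receiver without a scheduler pass.

```d
import core.thread : Fiber;
import std.conv : text;

/// Hypothetical single-slot mailbox for Fibers on one thread.
final class HandOff(T)
{
    private Fiber waiter;  // Fiber parked in `receive`, if any
    private T slot;
    private bool full;

    T receive()
    {
        while (!full)
        {
            waiter = Fiber.getThis();
            Fiber.yield();
        }
        full = false;
        return slot;
    }

    void send(T value)
    {
        slot = value;
        full = true;
        if (waiter !is null)
        {
            auto w = waiter;
            waiter = null;
            w.call();  // switch directly to the receiver
        }
    }
}

string[] handOffDemo()
{
    auto box = new HandOff!int;
    string[] log;

    auto b = new Fiber({
        foreach (_; 0 .. 2)
            log ~= text("B: got ", box.receive());
    });
    auto a = new Fiber({
        foreach (i; 1 .. 3)
        {
            log ~= text("A: sending ", i);
            box.send(i);
        }
    });

    b.call();  // B parks itself in `receive`
    a.call();  // every `send` resumes B immediately
    return log;
}

unittest
{
    assert(handOffDemo() == [
        "A: sending 1", "B: got 1",
        "A: sending 2", "B: got 2",
    ]);
}
```

Each message is consumed by B before A gets to send the next 
one, and the only calls made from outside the Fibers are the two 
initial `call`s.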

This Google doc has a lot of good information, if you're 
interested: 
https://docs.google.com/document/d/1yIAYmbvL3JxOKOjuCyon7JhW4cSv1wy5hC0ApeGMV9s/pub

It's still a problem we're working on, as some issues are unique 
to D and we haven't found a good solution (e.g. requiring 
`shared` for same-thread Fiber communication is quite 
problematic). If we ever reach a satisfactory solution, I'll try 
upstreaming it.

