std.concurrency and fibers

Thu Oct 4 05:53:43 PDT 2012

On 04-10-2012 14:48, Timon Gehr wrote:
> On 10/04/2012 02:22 PM, Alex Rønne Petersen wrote:
>> On 04-10-2012 14:11, Timon Gehr wrote:
>>> On 10/04/2012 01:32 PM, Alex Rønne Petersen wrote:
>>>> Hi,
>>>>
>>>> We currently have std.concurrency as a message-passing mechanism. We
>>>> encourage people to use it instead of OS threads, which is great.
>>>> However, what is *not* great is that spawned tasks correspond 1:1 to OS
>>>> threads. This is not even remotely scalable for Erlang-style
>>>> concurrency. There's a fairly simple way to fix that: Fibers.
>>>>
>>>> The only problem with adding fiber support to std.concurrency is that
>>>> the interface is just not flexible enough. The current interface is
>>>> completely and entirely tied to the notion of threads (contrary to what
>>>> its module description says).
>>>>
>>>> Now, I see a number of ways we can fix this:
>>>>
>>>> A) We completely get rid of the notion of threads and instead simply
>>>> speak of 'tasks'. This trivially allows us to use threads, fibers,
>>>> whatever to back the module. I personally think this is the best way to
>>>> build a message-passing abstraction because it gives enough transparency
>>>> to *actually* distribute tasks across machines without things breaking.
>>>> B) We make the module capable of backing tasks with both threads and
>>>> fibers, and expose an interface that allows the user to choose what kind
>>>> of task is spawned. I'm *not* convinced this is a good approach because
>>>> it's extremely error-prone (imagine doing a thread-based receive inside
>>>> a fiber-based task!).
>>>> C) We just swap out threads with fibers and document that the module
>>>> uses fibers. See my comments in A for why I'm not sure this is a good
>>>> idea.
>>>>
>>>> All of these are going to break code in one way or another - that's
>>>> unavoidable. But we really need to make std.concurrency grow up; other
>>>> languages (Erlang, Rust, Go, ...) have had micro-threads (in some form)
>>>> for years, and if we want D to be seriously usable for large-scale
>>>> concurrency, we need to have them too.
>>>>
>>>> Thoughts? Other ideas?
>>>>
>>>
>>> +1, but what about TLS?
>>
>> I think that no matter what we do, we have to simply say "don't do that"
>> to thread-local state (it would break in distributed scenarios too, for
>> instance).
>>
>> Instead, I think we should do what the Rust folks did: Use *task*-local
>> state and leave it up to std.concurrency to figure out how to deal with
>> it. It won't be as 'seamless' as TLS variables in D of course, but I
>> think it's good enough in practice.
>>
>
> If it is not seamless, we have failed. IMO the runtime should expose an
> interface for allocating TLS, switching between TLS instances and
> destroying TLS.

I suppose it could be done.

But keep in mind the side effects of an approach like this: some 
thread-local variables (for instance, think 'chunk' inside emplace) 
would break (or at least behave very weirdly) if you switch the *entire* 
TLS context when entering a task.

Sure, we could use the runtime interface for TLS switching only for 
task-local state, but then we're back to square one with it not being 
seamless.
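
To make the TLS concern concrete, here is a minimal sketch of what happens today when fibers touch a thread-local variable. The names counter, bump and runTwoFibers are made up for illustration; only core.thread.Fiber is real. Both fibers run on the calling thread, so they see the same thread-local instance rather than one copy per task:

```d
import core.thread : Fiber;

// Module-level variables in D are thread-local by default.
int counter;

// Hypothetical task body: bump the thread-local counter, yield, bump again.
void bump()
{
    ++counter;
    Fiber.yield();
    ++counter;
}

// Interleaves two fibers on the calling thread and returns the final counter.
int runTwoFibers()
{
    counter = 0;
    auto a = new Fiber(&bump);
    auto b = new Fiber(&bump);
    a.call(); // a increments, then yields
    b.call(); // b increments the *same* variable, then yields
    a.call(); // a resumes and increments again
    b.call(); // b resumes and increments again
    return counter;
}

void main()
{
    // Both fibers shared one thread-local counter: 4 increments in total.
    assert(runTwoFibers() == 4);
}
```

If tasks were backed by fibers, any code that assumes "one copy per task" for such a variable would silently share state like this.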

>
> What about the stack? Allocating a fixed-size stack per task is costly
> and Walter opposes dynamic stack growth.

Yeah, I never understood why. It's essential for functional-style code 
running in constrained tasks. It's not just about conserving memory; 
it's to make recursion feasible.

In any case, fibers currently allocate PAGESIZE * 4 bytes for their stacks by default.
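
For reference, the stack size is fixed when the fiber is created and can be passed as the second constructor argument. A rough sketch, assuming a made-up recursive workload (sumTo) and an arbitrary 1 MiB stack; the helper sumInFiber is hypothetical too:

```d
import core.thread : Fiber;

// Hypothetical recursive workload: sums 1..n, one stack frame per step.
ulong sumTo(ulong n)
{
    return n == 0 ? 0 : n + sumTo(n - 1);
}

// Runs sumTo(n) inside a fiber whose stack size is chosen up front.
ulong sumInFiber(ulong n, size_t stackBytes)
{
    ulong result;
    // The second argument is the stack size in bytes; it cannot grow later.
    auto f = new Fiber({ result = sumTo(n); }, stackBytes);
    f.call();
    assert(f.state == Fiber.State.TERM);
    return result;
}

void main()
{
    // 1 MiB is an arbitrary illustrative size, not a recommendation.
    assert(sumInFiber(1_000, 1024 * 1024) == 500_500);
}
```

Without dynamic growth, every task has to be sized for its deepest recursion, which is exactly the per-task cost being discussed.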

-- 
Alex Rønne Petersen
alex at lycus.org
http://lycus.org