Let's talk about fibers
Liran Zvibel via Digitalmars-d
digitalmars-d at puremagic.com
Wed Jun 3 11:34:33 PDT 2015
Hi,
We discussed (not) moving fibers between threads at DConf last
week, and the topic came up again in the announce group. I think
this matter is important enough to get a thread of its own.
Software fibers/coroutines were created to make asynchronous
programming using a Reactor (or another "event loop I/O
scheduler") more seamless.
For those unaware of the Reactor pattern, I advise reading [
http://en.wikipedia.org/wiki/Reactor_pattern ;
http://www.dre.vanderbilt.edu/~schmidt/PDF/reactor-siemens.pdf ],
and for some perspective on how other languages have addressed
this I recommend watching Guido van Rossum's talk about asyncio
and Python: https://www.youtube.com/watch?v=aurOB4qYuFM
The Reactor pattern is a long-established, widely accepted way to
achieve low-latency async I/O, and it became famous thanks to the
Web and the C10k problem. Using the Reactor is the most efficient
way to leverage current CPU architectures to perform lots of I/O,
for many reasons outside the scope of this post.
Another very important quality of a reactor-based approach is
that since all event handlers serialize on a single I/O scheduler
("the reactor") on each thread, if designed correctly programmers
don't have to think about concurrency or worry about data races.
Another thing to note: when using the reactor pattern you have to
make sure that no event handler ever blocks. Once an event
handler blocks, since this is a non-preemptive model, the other
event handlers cannot run, starving both themselves and the
clients on the other side of the network. Reactor implementations
usually detect and report when an event handler takes too long
before yielding control (the threshold is application-dependent,
but should be in the microsecond range on current hardware).
The downside of the reactor pattern (or what used to be its
downside) is that the programmer has to manually keep the
state/context of each event handler. Since each "logical"
operation is composed of many I/O transactions (some network
protocol to keep track of, maybe a networked DB access for some
data, reads/writes to local or remote files, etc.), the reactor
keeps a context for each callback and I/O event, and the
programmer has to update that context and manually register new
event handlers for every additional I/O transaction, in many
cases changing the callback registration along the way.
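A hypothetical callback-style fragment shows the burden (all names
here are invented, and a fake asyncDbRead that fires its callback
immediately stands in for real I/O): the logical state has to
travel in an explicit context struct, and a fresh handler must be
registered for each further I/O step:

```d
// Explicit context for one logical operation, threaded by hand
// through every callback stage.
struct RequestCtx
{
    string user;
    string dbRow;
}

// Fake async primitive: invokes the callback immediately, purely
// for illustration; a real one would register with the reactor.
void asyncDbRead(string key, void delegate(string) cb)
{
    cb("row-for-" ~ key);
}

// One stage of the operation: issue the "async" read, then
// register the next handler for when the reply arrives.
void onRequest(RequestCtx* ctx, void delegate(RequestCtx*) done)
{
    asyncDbRead(ctx.user, (string row) {
        ctx.dbRow = row;   // update the context manually
        done(ctx);         // hand off to the next stage's handler
    });
}

void main()
{
    RequestCtx ctx;
    ctx.user = "alice";
    string got;
    onRequest(&ctx, (RequestCtx* c) { got = c.dbRow; });
    assert(got == "row-for-alice");
}
```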
This downside means that it's more difficult to program for a
Reactor model, but since programmers don't have to think about
races and concurrency issues (and then debug them...), in our
experience it is still more efficient than programming with
general-purpose threads if you care about correctness/coherency.
One way to mitigate this complexity is the Proactor pattern:
implementing higher-level async I/O services over the reactor,
thus sparing the programmer a lot of the low-level context
headaches.
Up until now I have not said anything about fibers/coroutines.
What fibers bring to the table is the ability to program within
the reactor model without having to manually keep a context
separate from the program logic, and without having to manually
re/register callbacks for different I/O events.
D's fibers allowed us to create an async I/O library with support
for network/file/disk operations and higher-level conditions
(waiters, barriers, etc.) that lets the programmer write code as
if it runs in its own thread (almost: sometimes fibers are
explicitly "spawned", i.e. added to the reactor, and fiber
conditions are slightly different from spawning and joining
threads) without paying the huge correctness/coherence and
performance penalties of the threading model.
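The difference can be sketched with core.thread.Fiber directly:
the handler below reads top to bottom as if it were blocking,
while Fiber.yield marks the spot where a real async I/O library
would suspend it until its I/O completes. Here the surrounding
main plays the reactor's role, resuming the fiber by hand:

```d
import core.thread : Fiber;

void main()
{
    int[] trace;

    // Sequential-looking handler; it yields wherever it would
    // otherwise block on I/O. No explicit context struct needed:
    // the fiber's stack *is* the context.
    auto f = new Fiber({
        trace ~= 1;
        Fiber.yield();   // "wait for I/O": control returns to the reactor
        trace ~= 2;
    });

    f.call();            // runs until the first yield
    trace ~= 0;          // the reactor runs other handlers meanwhile
    f.call();            // resume once the I/O "completed"

    assert(trace == [1, 0, 2]);
    assert(f.state == Fiber.State.TERM);
}
```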
There are two main reasons why it does not make sense to move
fibers between threads:
1. You'll start having concurrency issues. Let's assume a main
fiber receives some request and spawns three fibers that look
into different DBs to get some info and update an array with the
data. The array will probably live on the stack of the first
fiber. If fibers don't move between threads, there is nothing to
worry about (as expected by the model). If you start moving
fibers across threads, you now have to guard this array to make
sure it stays coherent.
This is a simple example, but it shows that you're "losing" one
of the biggest selling points of the whole reactor-based model.
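A sketch of that scenario with core.thread.Fiber (the yield stands
in for waiting on a DB reply; the helper name is invented): the
result array lives on main's stack, and the worker fibers write
into it with no locking, which is sound only because they all run
on the same thread:

```d
import core.thread : Fiber;

// Hypothetical worker: "waits" for a DB reply, then writes its
// result straight into a slot of the parent's stack array.
Fiber makeWorker(int* slot, int value)
{
    return new Fiber({
        Fiber.yield();   // suspend until the reply "arrives"
        *slot = value;   // no lock: all fibers share one thread
    });
}

void main()
{
    int[3] results;      // on main's stack; coherent without guards
                         // only because fibers never change threads
    Fiber[3] kids;
    foreach (i; 0 .. 3)
        kids[i] = makeWorker(&results[i], i * 10);

    foreach (k; kids) k.call();   // run each until it "blocks"
    foreach (k; kids) k.call();   // replies arrive; fibers finish
    assert(results == [0, 10, 20]);
}
```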
2. Fibers and reactor-based I/O work well (read: make sense) when
you have lots of concurrent, very small transactions (similar to
the Web C10k problem, or a storage machine). In this case, if one
of the threads has more capacity than the rest, the I/O scheduler
("reactor") will simply spawn new fibers for newly accepted
transactions on that thread. If you're not in a situation where
balancing can be done by placing new requests in the right place,
then you probably should not use the reactor model at all, but a
different one that suits your application better.
Currently we can spawn another reactor to take more load, but the
load is balanced statically at a system-wide level. On previous
projects we had several reactors running on different threads,
each providing very different functionality (with different
handlers, naturally).
We never ran into a situation where moving a fiber between
threads made any sense.
As we see, there is nothing to gain and lots to lose by moving
fibers between threads.
Now, if we want to make sure fibers are well supported in D there
are several other things we should do:
1. Implement a good async I/O library that supports fiber-based
programming. I don't know vibe.d very well (i.e., at all), but
maybe we (Weka.IO) can help review it and suggest ways to turn it
into a general async I/O library (we have over 15 years of
experience developing with the reactor model in many
environments).
2. Add better compiler support. The one problem with fibers is
that upon creation you have to know the stack size for that
fiber, and different functions create different stack depths. It
is very convenient to use the stack to hold all objects (recall
Walter's first-day talk, for example), and it can serve as a very
convenient way to "garbage collect" all resources acquired during
the run of that fiber, but currently we don't leverage it to the
max, since we don't have a good way to know or limit the amount
of memory used this way.
If the compiler could analyze stack usage by functions
(recursively) and give us hints about the upper bounds of stack
usage, we would be able to use the stack more aggressively and
utilize memory much better.
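Today that guess is made at construction time: Fiber's second
constructor argument is the stack size, and the programmer must
over-provision because nothing checks it against actual usage
(the 64 KiB figure and recursion depth below are arbitrary
assumptions for illustration):

```d
import core.thread : Fiber;

int depth;   // module-level so main can observe the fiber's work

void deepWork()
{
    // Each recursive frame consumes fiber stack; only the manual
    // guess in main says whether 64 KiB is enough. This is the
    // upper bound a compiler analysis could compute for us.
    void recurse(int n)
    {
        if (n == 0) return;
        recurse(n - 1);
    }
    recurse(100);
    depth = 100;
}

void main()
{
    enum stackSize = 64 * 1024;               // manual upper-bound guess
    auto f = new Fiber(&deepWork, stackSize); // size fixed at creation
    f.call();
    assert(f.state == Fiber.State.TERM);
    assert(depth == 100);
}
```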
Also -- I think such static analysis would be a big selling point
of D for systems like ours.
I think everything is now written down, and we can move the
discussion here.
Liran.