Processes and Channels, cf. goroutines.

Sean Kelly sean at invisibleduck.org
Wed Feb 5 12:37:43 PST 2014


On Wednesday, 5 February 2014 at 15:38:43 UTC, Bienlein wrote:
>
> On a very well equipped machine 10.000 threads is about the 
> maximum for the JVM. Now for D 1.000.000 kernel threads are not 
> a problem!? Well, I'm a D newbie and a bit confused now... Have 
> to ask some questions trying not to bug people. Apparently, a 
> kernel thread in D is not an OS thread. Does D have its own 
> threading model then? Couldn't see that from what I found on 
> dlang.org. The measurement result for fibers is that much 
> better than for threads, because fibers have less overhead for 
> context switching? Will actors in D benefit from your 
> FiberScheduler when it has been released? Do you know which 
> next version of D your FiberScheduler is planned to be included?

Well, I spawned 1 million threads, but there's no guarantee that
1 million were running concurrently, so I decided to run a test.
I forced the code to block until all threads were started; with
kernel threads this hung with 2047 threads running (this is on
OSX), so OSX appears to have a hard internal limit of 2047
threads per process.  It's possible this limit can be raised
somehow, but I didn't investigate.  And since I don't currently
have a good way to block fibers, the fiber version used a busy
wait, which made waiting for all the threads to spin up slow
going.

Next I figured I'd keep a high-water mark of the concurrent
thread count for the code I posted yesterday.  Both fibers and
kernel threads topped out at about 10.  For fibers this makes
perfect sense given the yield strategy (each client thread yields
10 times while running), and I guess kernel thread scheduling
happened to come out about the same.  So the fact that I was able
to spawn 1 million kernel threads doesn't actually mean a whole
lot.  I should have thought about that more yesterday.  Because
the added synchronization for counting threads slowed everything
down a bit, I reduced the number of threads to 100,000.  Here are
some timings:

$ time concurrency threads
numThreadsToSpawn = 100000, maxConcurrent = 12

real	1m8.573s
user	1m22.516s
sys	0m27.985s

$ time concurrency fibers
numThreadsToSpawn = 100000, maxConcurrent = 10

real	0m5.860s
user	0m3.493s
sys	0m2.361s
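For reference, a concurrency high-water mark like the one above
can be kept lock-free with core.atomic.  This is a rough sketch,
not the code from yesterday's post; the names are mine:

```d
import core.atomic;

shared size_t running;    // "threads" currently executing
shared size_t highWater;  // maximum concurrency observed

// Call on entry to the spawned function.
void enter()
{
    immutable now = atomicOp!"+="(running, 1);
    // Publish a new high-water mark; the CAS loop handles races.
    auto seen = atomicLoad(highWater);
    while (now > seen && !cas(&highWater, seen, now))
        seen = atomicLoad(highWater);
}

// Call just before the spawned function returns.
void leave()
{
    atomicOp!"-="(running, 1);
}
```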

So in short, a "kernel thread" in D (i.e. what you get by
instantiating core.thread.Thread) is an OS thread.  Fibers are
user-space threads built on core.thread.Fiber that context
switch only when explicitly yielded.
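A minimal illustration of the difference, using nothing but
core.thread:

```d
import core.thread;

void main()
{
    // A kernel thread: core.thread.Thread maps 1:1 to an OS
    // thread and is scheduled preemptively by the OS.
    auto t = new Thread({
        // runs concurrently with main
    });
    t.start();
    t.join();

    // A fiber: a user-space thread that runs until it
    // explicitly yields back to its caller.
    auto f = new Fiber({
        Fiber.yield();  // suspend; control returns to call()
        // resumes here on the next call()
    });
    f.call();  // run until the yield
    f.call();  // run to completion
}
```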

One thing to note about the FiberScheduler is that I haven't
sorted out a solution for thread-local storage.  So if you're
using the FiberScheduler and each "thread" is accessing some
global static data it expects to be exclusive to itself, you'll
end up with an undefined result.  Making D's "thread-local by
default" actually be fiber-local when using fibers is a pretty
hard problem to solve, and can be dealt with later if the need
arises.  My hope is that by making the choice of scheduler
user-defined, the user can choose the threading model
appropriate for their application, and we can hopefully sidestep
the need to sort this out.  This was the main issue blocking my
doing this ages ago, and I didn't think of the pluggable
approach until recently.
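To make the TLS hazard concrete, here's a contrived sketch using
core.thread.Fiber directly:

```d
import core.thread;

int counter;  // thread-local by default in D, but shared by
              // every fiber multiplexed onto the same thread

void main()
{
    auto a = new Fiber({
        counter = 1;
        Fiber.yield();         // another fiber runs here...
        assert(counter == 2);  // ...and clobbered "our" value
    });
    auto b = new Fiber({ counter = 2; });

    a.call();  // a sets counter = 1, then yields
    b.call();  // b runs on the same thread, sets counter = 2
    a.call();  // a resumes and observes b's write
}
```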

The obvious gain here is that std.concurrency is no longer
strictly limited by the overhead of kernel threads, so it can be
used more in keeping with the actor model, as originally
intended.  I can imagine more complex schedulers multiplexing
fibers across a pool of kernel threads, for example.  The
FiberScheduler is more a proof of concept than anything.

As for when this will be available... I will have a pull request
sorted out shortly, so you could start playing with it soon.  It
being included in an actual release means a review and such, but
as this is really just a fairly succinct change to an existing
module, I hope it won't be terribly contentious.


> In Go you can easily spawn 100.000 goroutines (aka green 
> threads), probably several 100.000. Being able to spawn way 
> more than 100.000 threads in D with little context switching 
> overhead as with using fibers you are basically in the same 
> league as with Go. And D is a really rich language contrary to 
> Go. This looks cool :-)

Yeah, I think it's exciting.  I had originally modeled
std.concurrency after Erlang and like the way the syntax worked
out, but using kernel threads is limiting.  I'm interested to see
how this scales once people start playing with it.  It's possible
that some tuning of when yields occur may be needed as time goes
on, but that really needs more eyes than my own and probably
multiple real world tests as well.

As some general background on actors vs. CSP in std.concurrency,
I chose actors for two reasons.  First, the communication model
for actors is unstructured, so it's adaptable to a lot of
different application designs.  If you want structure you can
impose it at the protocol level, but it isn't necessary to do
so--simply using std.concurrency requires practically no code at
all for the simple case.  And second, I wasn't terribly fond of
the "sequential" part of CSP.  I really want a messaging model
that scales horizontally across processes and across hosts, and
the CSP algebra doesn't work that way.  At the time, I found a
few algebras that were attempting to basically merge the two
approaches, but nothing really stood out.
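For the simple case, that unstructured messaging looks like this
today (with kernel threads; the proposed FiberScheduler wouldn't
change this code):

```d
import std.concurrency;
import std.stdio;

void worker()
{
    // receive() is unstructured: just pattern-match on
    // whatever message types this actor cares about.
    receive(
        (int n)    { writeln("got int: ", n); },
        (string s) { writeln("got string: ", s); }
    );
}

void main()
{
    auto tid = spawn(&worker);
    tid.send(42);  // fire-and-forget, actor style
}
```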

