std.parallelism changes done

Thu Mar 24 18:51:02 PDT 2011

On 3/24/2011 9:15 PM, Sönke Ludwig wrote:
> On 24.03.2011 13:03, Michel Fortin wrote:
>> On 2011-03-24 03:00:01 -0400, Sönke Ludwig
>> <ludwig at informatik.uni-luebeck.de> said:
>>
>>> On 24.03.2011 05:32, dsimcha wrote:
>>>> In addition to improving the documentation, I added
>>>> Task.executeInNewThread() to allow Task to be useful without a
>>>> TaskPool.
>>>> (Should this have a less verbose name?)
>>>
>>> The threading system I designed for the company I work for uses
>>> priority per task to control which tasks can overtake others. A
>>> special priority is out-of-band (the name may be debatable), which
>>> will guarantee that the task will run in its own thread so it can
>>> safely wait for other tasks. However, those threads that process OOB
>>> tasks are also cached in the thread pool and reused for new OOB tasks.
>>> Only if the number of parallel OOB tasks goes over a specific number,
>>> new threads will be created and destroyed. This can save quite a bit
>>> of time for those tasks.
>>>
>>> Both kinds of priority have been very useful and I would suggest
>>> putting at least the executeInNewThread() method into ThreadPool so
>>> that such an optimization can be made later.
>>>
>>> The task priority thing in general may only be necessary for complex
>>> applications with user interaction, where you have to satisfy certain
>>> interactivity needs. I wouldn't be too sad if this is not implemented
>>> now, but it would be good to keep it in mind as a possible improvement
>>> for later.
>>
>> Do you think having multiple task pools each with a different thread
>> priority would do the trick? Simply put tasks in the task pool with the
>> right priority... I had a similar use case in mind and this is what I
>> proposed in the previous discussion.
>>
>
> Yes, that may be actually enough because although you would normally
> want to avoid the overhead of the additional threads running in
> parallel, in the scenarios I have in mind you always have unrelated
> things in different priority classes. And for these different tasks,
> running in parallel should be the exception (otherwise using
> priorities would be strange in the first place).
>
> The only thing that is a bit of a pity is that now you have to manage
> multiple thread pools instead of simply using the one singleton instance
> in the whole application. And this could really cause some headaches if
> you have a lot of different types of workload that may all have
> different priorities but also may have the same - you would somehow have
> to share several thread pools across those types of workload.
>
> (type of workload = copying files, computing preview images, running
> physics calculations, etc.)

My main concern here is that these kinds of use cases are getting far 
beyond the scope of std.parallelism.  By definition (at least as I 
understand it) parallelism is focused on throughput, not 
responsiveness/latency and is about utilizing as many execution 
resources as possible for useful work.  (This article, originally posted 
here by Andrei, describes the distinction nicely: 
http://existentialtype.wordpress.com/2011/03/17/parallelism-is-not-concurrency/)
If you're implementing parallelism, then it is correct to only use one 
thread on a single-core machine (std.parallelism does this by default), 
since one thread will utilize all execution resources.  If you're 
implementing concurrency, this is not correct.  Concurrency is used to 
implement parallelism, but that's different from saying concurrency _is_ 
parallelism.

When you start talking about application responsiveness, prioritization, 
etc., you're getting beyond _parallelism_ and into general-case 
concurrency.  I have neither the expertise nor the desire to build a 
general-case concurrency library.  D already has a general-case 
concurrency library (std.concurrency), and this might be a better place 
to implement suggestions dealing with general-case concurrency.

std.parallelism was designed from the ground up to focus on parallelism, 
not general-case concurrency.  I don't mind implementing features useful 
to general-case concurrency if they're trivial in both interface and 
implementation, but I'd rather not do any that require major changes to 
the interface or implementation.
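
For reference, here is a minimal sketch of the two options discussed in 
this thread that already fit that criterion: running a Task in its own 
thread via executeInNewThread(), and using a second TaskPool with a 
lowered thread priority.  It assumes the std.parallelism interface as 
currently documented (task, TaskPool, TaskPool.priority, yieldForce, 
finish).  The sumOfSquares workload and the pool size of 2 are made up 
for illustration, and how much the priority setting actually matters 
depends on the OS scheduler.

```d
import core.thread : Thread;
import std.parallelism : task, TaskPool;

// Hypothetical workload, standing in for e.g. "computing a preview image".
int sumOfSquares(int n)
{
    int total = 0;
    foreach (i; 1 .. n + 1)
        total += i * i;
    return total;
}

void main()
{
    // Option 1: a Task without any TaskPool.  It runs in a freshly
    // created thread, so it may safely block or wait on other work.
    auto standalone = task!sumOfSquares(10);
    standalone.executeInNewThread();

    // Option 2: a separate pool for background work whose worker
    // threads run at the lowest priority (effect is OS-dependent).
    auto background = new TaskPool(2);   // pool size chosen arbitrarily
    background.priority = Thread.PRIORITY_MIN;

    auto low = task!sumOfSquares(100);
    background.put(low);

    // yieldForce waits for completion and returns the result.
    assert(standalone.yieldForce == 385);
    assert(low.yieldForce == 338_350);

    // Let the non-daemon worker threads terminate so the program can exit.
    background.finish();
}
```

This still leaves the application to decide which pool a given piece of 
work goes into, which is the management overhead Sönke points out above.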