parallel threads stall until all thread batches are finished.

Joe at bloow.edu
Sat Aug 26 03:40:58 UTC 2023


On Friday, 25 August 2023 at 21:43:26 UTC, Adam D Ruppe wrote:
> On Wednesday, 23 August 2023 at 13:03:36 UTC, Joe wrote:
>> to download files from the internet.
>
> Are they particularly big files? You might consider using one 
> of the other libs that does it all in one thread. (i ask about 
> size cuz mine ive never tested doing big files at once, i 
> usually use it for smaller things, but i think it can do it)
>
>> The reason why this causes me problems is that the downloaded 
>> files, which are cached to a temporary file, stick around and 
>> do not free up space (think of it just as using memory) and 
>> this can cause some problems some of the time.
>
> this is why im a lil worried about my thing, like do they have 
> to be temporary files or can it be memory that is recycled?

The downloading is simply a wrapper that provides some caching to 
a RAM drive and management of other things, and doesn't have any 
clue how or what is being downloaded. It passes a link to 
something like youtube-dl or yt-dlp and has it do the downloading.

Everything works great except for the bottleneck when things are 
not balancing out. It's not a huge deal since it does work and, 
for the most part, gets everything downloaded, but it sorta 
defeats the purpose of having multiple downloads (which is much 
faster since each individual download seems to be throttled).

Increasing the work unit size makes the problem worse, while 
reducing it to 1 floods the downloads (e.g., having 200 or 
even 2000 downloads at once).

Ultimately this seems like a design flaw in ThreadPool, which 
should auto-rebalance the threads and not tie the number of 
workers to the work unit size (well, length/workUnitSize).

e.g., suppose we have 1000 tasks and set the work unit size to 
100. This gives 10 work units, and 10 workers will be spawned 
(not sure if this is limited to the total number of CPU threads 
or not).
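For reference, the current behaviour can be sketched with std.parallelism's parallel foreach, where `download` is a hypothetical stand-in for the per-item work:

```d
import std.parallelism;
import std.range : iota;

void main()
{
    // 1000 elements with a work unit size of 100: the range is cut
    // into 10 work units, which the default pool's threads consume.
    // A thread that draws a slow work unit holds all 100 elements
    // in it until the whole unit is done.
    foreach (i; parallel(iota(1000), 100))
    {
        // download(i); // hypothetical per-item work
    }
}
```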

What would be nice is to be able to set the work unit size to 1, 
giving 1000 work units, but limit concurrent workers to, say, 10. 
So at any time we would have 10 workers, each working on 1 
element. When one finishes, it can be repurposed for any 
unfinished task.

The second case is preferable since there should be no issues 
with balancing, but one still gets 10 workers. The stalling comes 
from the algorithm's design and not anything innate in the 
problem or workload itself.
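If I'm reading the std.parallelism docs right, an explicit TaskPool with a small thread count plus a work unit size of 1 might already approximate this; a hedged sketch (again with a hypothetical `download`):

```d
import std.parallelism;
import std.range : iota;

void main()
{
    // Sketch, assuming TaskPool behaves as documented: a pool of
    // 10 threads with a work unit size of 1, so each thread pulls
    // one element at a time and a finished worker immediately
    // moves on to the next pending task.
    auto pool = new TaskPool(10);
    scope (exit) pool.finish(true); // wait for remaining work, then shut down

    foreach (i; pool.parallel(iota(1000), 1))
    {
        // download(i); // hypothetical per-item work (e.g., invoking yt-dlp)
    }
}
```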



More information about the Digitalmars-d-learn mailing list