[Issue 17914] [Reg 2.075] Fibers guard page uses a lot more memory mappings

d-bugmail at puremagic.com d-bugmail at puremagic.com
Fri Oct 27 09:32:09 UTC 2017


https://issues.dlang.org/show_bug.cgi?id=17914

Nemanja Boric <4burgos at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |4burgos at gmail.com

--- Comment #7 from Nemanja Boric <4burgos at gmail.com> ---
(In reply to briancschott from comment #5)
> > The problem is, this either break toy code that just
> > creates lots of useless Fibers or heavy production code
> > that actually needs 30K Fibers.
> 
> That's not entirely accurate. I found this because the range implementation
> for a complex multi-dimensional tree structure was implemented as a
> std.concurrency.Generator. The problem is caused by doing many tree
> traversals in a loop on a machine that has 512GB of memory. We don't
> actually need 30k fibers. We need one fiber at a time 30k times in a row. By
> the time we start using the second fiber we don't need the first one anymore.
> 
> It's possible to re-implement the range in question, but the amount of code
> required to do so is much greater than the current implementation using
> recursion and fibers.

Is it not possible just to use a simple pool for the Fibers? So, instead of
allocating a new fiber, you reset the recycled one to the new state. Is this
not possible with the std.concurrency.Generator?

> If we really need a comprise here, I'd definitely increase the stack size further, e.g. to 64KiB (vibe.d's default).

This could be a problem on 32bit systems, where this would suck available
address space very quickly, if I'm not mistaken. Of course, we could still make
it a default for 64bit systems.

> A possible idea is putting a pattern in the guard data (leave it read/write), and check the guard data on fiber deallocation, and optionally on every yield (could be a performance issue, but maybe only used in debug mode?)

One problem I can see with this is that it doesn't show the problem on the spot
- good thing about protected region is that we die as soon as we try accessing
it. If the fiber does a lot of calls before yields, it may be that we're far
from the place that actually corrupted memory, and it's not that easy to debug.
Of course, it still prevents memory corruption (but at a cost that grows
linearly with the number of fibers).

> I'd err on the side of memory safety. Stack overflows are an actual problem with Fibers and hard&time-consuming to debug.

Yes. I can't emphasize enough how many times we had these issues. And it's
never been a problem for us to confirm that this is what's happening with the
guard page in. It was always a hell to confirm what's going on with the
previous runtime, mostly because of the fact that it's not the fiber that's
actually corrupting memory the one it will crash, but some other poor fiber
that happens to have a stack adjacent to the rouge one.

Having said all this - why not just keep the default guard page on, but if
somebody is confident they don't need it (and they could even do a
`version(debug){}else{guardPageOff}`), simply providing 0 for the guard size in
the Fiber's constructor will turn this entire thing off? Perhaps we could
document this better (as I didn't expect this issue to raise when implementing
this, not much documentation is there).

However, I also thing Steven has a valid point when he says:

> Except this breaks existing code that may not have a stack overflow problem. I hate to say it, but unfortunately, we are stuck with the status quo for now.

--


More information about the Digitalmars-d-bugs mailing list