forcing weak purity

Wed May 23 08:29:49 PDT 2012

On 23/05/12 15:56, Alex Rønne Petersen wrote:
> On 23-05-2012 15:17, Don Clugston wrote:
>> On 23/05/12 05:22, Steven Schveighoffer wrote:
>>> I have come across a dilemma.
>>>
>>> Alex Rønne Petersen has a pull request changing some things in the GC to
>>> pure. I think gc_collect() should be weak-pure, because it could
>>> technically run on any memory allocation (which is already allowed in
>>> pure functions), and it runs in a context that doesn't really affect
>>> execution of the pure function.
>>>
>>> So I think it should be able to be run inside a strong pure function.
>>
>> I am almost certain it should not.
>>
>> And I think this is quite important. A strongly pure function should be
>> considered to have its own gc, and should not be able to collect any
>> memory it did not allocate itself.
>>
>> Memory allocation from a pure function might trigger a gc cycle, but it
>> would ONLY look at the memory allocated inside that pure function.
>
> Implementing this on a per-function basis is not very realistic. Some
> programs have hundreds (if not thousands) of pure functions.

No, it's not realistic for every function, but it's extremely easy for
some. In particular, if you have a strongly pure function which has no
reference parameters, you just need a pointer recording where the heap
stood when that function was last entered. This partitions the heap into
two parts, and each can be gc'd independently.

And, in the non-pure part, nothing is happening while the pure function
runs. Once you've done a GC there, you NEVER need to do it again until
the function returns.
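
To make the partition idea concrete, here is a rough sketch in D. It is
only a toy: ToyHeap and runPartitioned are hypothetical names made up for
this illustration, a bump allocator stands in for the real GC heap, and
the "pure" work is assumed to return its result by value. It is not how
druntime works, and it simply discards the pure part on exit rather than
running a real collection over it.

```d
// Hypothetical toy heap: a bump allocator standing in for the GC heap.
struct ToyHeap
{
    ubyte[] storage;   // backing memory
    size_t top;        // storage[0 .. top] is in use

    // Hand out the next n bytes.
    ubyte[] allocate(size_t n)
    {
        assert(top + n <= storage.length, "toy heap exhausted");
        auto block = storage[top .. top + n];
        top += n;
        return block;
    }
}

// Run `work` as if it were a strongly pure function with no reference
// parameters: remember where the heap stood on entry, and on exit discard
// everything allocated past that mark. The int result is returned by
// value, so nothing above the mark is reachable afterwards.
int runPartitioned(ref ToyHeap heap, int delegate(ref ToyHeap) work)
{
    immutable mark = heap.top;     // heap position at entry
    scope (exit) heap.top = mark;  // reclaim the pure part only
    return work(heap);
}

unittest
{
    auto heap = ToyHeap(new ubyte[](64));
    auto outer = heap.allocate(16);    // belongs to the non-pure part
    outer[0] = 42;

    int result = runPartitioned(heap, delegate (ref ToyHeap h) {
        auto scratch = h.allocate(32); // belongs to the pure part
        scratch[0] = 7;
        return scratch[0] + 1;
    });

    assert(result == 8);
    assert(heap.top == 16);   // pure part reclaimed
    assert(outer[0] == 42);   // non-pure part untouched
}
```

Everything below the mark is the non-pure part and is never touched
while the work runs; everything above it belongs to the pure call alone.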

> Not to mention, we'd need some mechanism akin to critical regions to
> figure out when a thread is in a pure function during stop-the-world.
> Further, data allocated in a pure function f() in thread A must not be
> touched by a collection triggered by an allocation inside f() in thread
> B. It'd be a huge mess.

Not so. While a strongly pure function is running, it's impossible for
anything outside it to hold a pointer to memory allocated by that
function.
In my view, this is the single most interesting feature of purity.
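
A small D example of what that looks like in practice. The function name
sumOfSquares is made up for illustration. It takes its only parameter by
value and returns a value, so the scratch array it allocates can never be
seen by the caller.

```d
// Strongly pure: the only parameter is passed by value, so the function
// receives no references to the caller's memory.
int sumOfSquares(uint n) pure
{
    // Allocated inside the pure function. The caller never sees a
    // pointer to this array; only the int result leaves the function.
    auto scratch = new int[](n);
    foreach (i, ref x; scratch)
        x = cast(int)(i * i);

    int total = 0;
    foreach (x; scratch)
        total += x;
    return total;
}

unittest
{
    assert(sumOfSquares(4) == 14);   // 0 + 1 + 4 + 9
}
```

Once sumOfSquares returns, scratch is garbage by construction, whatever
any other thread is doing.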

> And, frankly, if my program dies from an OOME due to pure functions
> being unable to do full collection cycles, I'd just stop using pure
> permanently. It's not a very realistic approach to automatic memory
> management; at that point, manual memory management would work better.

Of course. But I don't see how that's relevant. How the pure function 
actually obtains its memory is an implementation detail.

There's a huge difference between "a global collection *may* be
performed from a pure function" and "it *must* be possible to force a
global collection from a pure function".

The difficulty in expressing the latter is a simple consequence of the 
fact that it is intrinsically impure.