Proposal: Relax rules for 'pure'

Don nospam at nospam.com
Mon Oct 25 19:07:36 PDT 2010


Bruno Medeiros wrote:
> On 23/09/2010 23:39, Robert Jacques wrote:
>> On Thu, 23 Sep 2010 16:35:23 -0400, Tomek Sowiński <just at ask.me> wrote:
>>>
>>> On topic: this means a pure function can take a reference to data that
>>> can be mutated by
>>> someone else. So we're giving up on the "can parallelize with no
>>> dataraces" guarantee on
>>> pure functions?
>>>
>>
>> In short, No. In long; the proposal is for pure functions become broken
>> up into two groups (weak and strong) based on their function signatures.
>> This division is internal to the compiler, and isn't expressed in the
>> language in any way. Strongly-pure functions provide all the guarantees
>> that pure does today and can be automatically parallelized or cached
>> without consequence. Weakly-pure functions, don't provide either of
>> these guarantees, but allow a much larger number of functions to be
>> strongly-pure. In order to guarantee a function is strongly pure, one
>> would have to declare all its inputs immutable or use an appropriate
>> template constraint.
> 
> I think we need to be more realistic with what kinds of optimizations
> we could expect from a D compiler and pure functions.
> Caching might be done, but only a temporary sense (caching under a 
> limited execution scope). I doubt we would ever have something like 
> memoization, which would incur memory costs (potentially quite big 
> ones), and so the compiler almost certainly would not be able to know 
> (without additional metadata/anotations or compile options) if that 
> trade-off is acceptable.
> 
> Similarly for parallelism, how would the compiler know that it's ok to 
> spawn 10 or 100 new threads to parallelize the execution of some loop?
> The consequences for program and whole-machine scheduling would not be 
> trivial and easy to understand. For this to happen, amongst other things 
> the compiler and OS would need to ensure that the spawned threads would 
> not starve the rest of the threads of that program.
> I suspect all these considerations might be very difficult to guarantee 
> on a non-VM environment.

I agree with this, especially with regard to memoization.
However, several other very interesting optimisations are
possible with pure, especially with regard to memory allocation.

At exit from an immutably pure function, all memory allocated by that 
function and its subfunctions can be collected, except for anything 
which is reachable through the function return value.
Even for a weakly-pure function, the set of roots for gc is limited to
the mutable function parameters and the return value.
And for all pure functions with no mutable reference parameters, it is 
guaranteed that no other thread has access to them.

The implications for gc are very interesting.


More information about the Digitalmars-d mailing list