purity and memory allocations/pointers
Timon Gehr
timon.gehr at gmx.ch
Sat Aug 3 09:47:51 PDT 2013
On 08/03/2013 05:59 PM, monarch_dodra wrote:
> I'm a bit confused about where we draw the line with purity, and memory.
>
> For example, take this "simple" function:
>
> //----
> char[] getMutableHello() pure
> {
> return "hello".dup;
> }
> //----
>
> This seems like it will *always* create the same result, yet at the same
> time, it calls the GC. "dup" could fail, amongst others, meaning there
> is failed purity.
>
> Is this OK?
>
Since a pure function call can overflow the stack, I don't really see
this particular point as an issue. (D is Turing complete, unlike the
machines it runs on.)
> :::::::::::::::::::::::::::::::::::::::::
>
> Where does purity mean when you are talking about slices/pointers?
> Simply that they *point*/*reference* the same payload? I mean:
>
> void main()
> {
> auto a = getMutableHello();
> auto b = getMutableHello();
> assert(a == b); //OK
> assert(a is b); //Derp :/
> }
>
> Did getMutableHello just violate its purity promise?
> ...
Well, no. It is by design. All purity really means is that the part of
the existing store not reachable by following references in the
arguments will not be mutated.
What is unfortunate in a sense is that 'is' can be used on immutable
references. This often precludes common subexpression elimination for
pure function calls even though such an optimization would be valid
otherwise.
> :::::::::::::::::::::::::::::::::::::::::
>
> Another thing that bothers me, again, when interfacing with the GC, is
> "capacity". It is currently marked as pure, but I believe this is wrong:
> The GC can in-place extend the "useable" memory of a dynamic array. This
> means all referencing slices will be affected, yet un-modified. This
> means that calling "s.capacity" depends on the global state of the GC:
>
> //----
> auto s1 = new ubyte[](12_000);
> auto bck = s1;
> size_t cap = s1.capacity;
> s1.reserve(13_000);
> assert(s1 is bck); //Apparently, s1 is not changed.
> assert(cap == s1.capacity); //This fails, yet capacity is pure?
> //----
>
> Again, is this OK?
> ...
Well, not really.
IMO the issue here is that if D had a formal semantics, it would likely
be indeterministic due to the GC interface. To make the execution of
pure functions deterministic (which is not currently part of their
guarantees), all operations that can witness indeterminism would need to
be banned in pure functions (eg. ~=, ordering addresses, traversing hash
tables etc.) It is not really possible to get rid of the non-determinism
completely while keeping any kind of allocations, since allocations will
return nondeterministic memory addresses. (Unless the formal semantics
fully specifies the GC implementation. :o) )
Another stance that can be taken is that it is not really global state,
since it is referenced by the slice. Then the above would be ok. (But
semantically ugly.) Under that point of view, an issue is that capacity
violates the transitivity of immutable.
> :::::::::::::::::::::::::::::::::::::::::
>
> One last question: Pointers.
>
> int get(int* p) pure
> {
> return *p;
> }
>
> void main()
> {
> int i = 0;
> auto p = &i;
> get(p);
> }
>
> Here, get, to me, is obviously not pure, since it depends on the state
> of the global "i". *Where* did "get" go wrong? Did I simply "abusively"
> mark get as pure? Is the "pure" keyword's guarantee simply "weak"?
> ...
Yes, it's weak.
> :::::::::::::::::::::::::::::::::::::::::
>
> To sum up the question, regardless of how the *keyword* pure currently
> works, my question is "*How* should it behave"? We're currently marking
> things in Phobos as pure, and I'm worried the only reason we are doing
> it is that the keyword is lenient, and said functions *aren't* actually
> pure...
I guess adding the requirement that a pure function cannot witness its
own indeterminism is required to make the keyword legitimate.
More information about the Digitalmars-d
mailing list