CTFE Overhaul (Was: RFC: Thrift project proposal (draft))

Sun Mar 27 08:37:53 PDT 2011

On Sun, 27 Mar 2011 08:36:39 -0400, spir <denis.spir at gmail.com> wrote:

> On 03/26/2011 09:57 PM, Don wrote:
>>>> The basic problem with the current implementation of CTFE is that it
>>>> uses copy-on-write. This means that references (including dynamic
>>>> arrays) don't work properly -- they just copy a snapshot of the thing
>>>> they are referencing. This is bug 1330. It also means it burns up  
>>>> memory
>>>> like you wouldn't believe.
>>>
>>> Right.  IIUC there's also no way to free the memory from copies that  
>>> are no
>>> longer referenced.  I can see where this would leak memory like a  
>>> sieve.
>>
>> That's not the big problem, actually. The issue is that x[7]=6;
>> duplicates x, even if x has 10K elements.
>> Now consider:
>>    for(int i=0; i<x.length; ++i) x[i]=3;
>>    // creates 100M new elements!! Should create none, or 10K at most.
>
> Hello Don,
>
> I don't understand your point. I have once implemented a toy dynamic  
> language, using the common trick of boxed elements (à la Lisp). But I  
> wanted to maintain value semantics as standard. A cheap way to do that  
> is copy on write; it is actually cheap since simple, atomic, elements  
> are never copied (since they cannot be changed on place), thus one just  
> just needs to trace complex elements (array-lists & named tuples in my  
> case):
> 	x := [1,2,3]	// create the array value, assign its ref
> 	y := x		// copy the ref, mark the value as shared
> 	x[1] := 0	// copy the value, reassign the ref, then change
> But the new value is not shared, thus:
> 	x[1] := 1	// change only
>
> So that in your loop example, at most one array copy happens (iff it was  
> shared). This is as far as I know what is commonly called copy-on-write.  
> There is no need to copy the value over and over again on every change  
> if it is not multiple-referenced, and noone does that, I guess.
>
> Side-Note: assignments of the form of "y := x" are really special, at  
> least conceptually; but also practically when pointers or refs enter the  
> game. I call them "symbol assignments" as the source is a symbol.
>
> Denis

Hi Denis,
What Don is explaining is not how you should implement copy-on-write,  
etc., but the actual implementation of arrays in DMD's CTFE system. Right  
now, any access to an array in CTFE causes the entire array to be  
duplicated, which is a major memory and performance issue, to say nothing  
of the fact that D arrays are supposed to have reference, not value  
semantics. I don't know how or why this behavior was ever introduced, only  
that it is awesome that Don is fixing it.