Memory allocation in D (noob question)

Regan Heath regan at netmail.co.nz
Wed Dec 5 06:51:12 PST 2007


Steven Schveighoffer wrote:
> "Regan Heath" wrote
>> Both variables above are references to the same data.  You're using one 
>> variable to change that data, therefore the other variable which still 
>> refers to the same data, sees the changes.
>>
>> If the concatenation operation had to reallocate the memory it would 
>> produce a copy, and you wouldn't see the changes.
>>
>> So, this behaviour is non-deterministic, however...
> 
> The problem is that invariant data is changing.  This is a no-no for pure
> functions which Walter has planned.  If invariant data can change without 
> violating the rules of the spec, then the compiler implementation or design 
> is flawed.  I think the design is what is flawed.

That is another issue which I didn't even address.

Assuming 'string' means 'invariant(char)[]' and assuming that means the 
char values cannot change (I say assuming because I haven't had the 
chance to really internalise the new const yet) then I reckon the 
implementation of invariant is simply broken/buggy.

> I have several problems with this concat operator issue.
> 
> First, that x ~= y does not effect the same behavior as x = x ~ y.  This is 
> a fundamental flaw in the language in my opinion.  Any operator of the op= 
> form is supposed to mean the same as x = x op y.  This is consistent 
> throughout all of D, except in this case.

The problem stems from the fact that x ~= y always assigns the result to 
x, whereas x ~ y can potentially be assigned to something else.  This 
means the latter must create a new/temporary object to store the result.

In the case of arrays this effectively means that x ~ y always creates a 
new array holding a copy of both operands.  But x ~= y need not create a 
new array, as it can append to the existing one when there is room.

The ~= form therefore allows a beneficial optimisation.  Not allowing 
people to have both methods at their disposal would likely cause an 
outcry.
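
To make the difference concrete, here is a minimal sketch, assuming a 
current D2 compiler.  The helper sameStart is my own name for 
illustration, not something from the spec or Phobos.  It only compares 
pointers, and whether ~= stays in place depends on how much spare room 
the runtime happens to have in the block, so I make no claim about what 
the second line prints.

```d
import std.stdio;

// Hypothetical helper: true when both slices start at the same address.
bool sameStart(const(char)[] a, const(char)[] b)
{
    return a.ptr is b.ptr;
}

void main()
{
    char[] x = "hello".dup;   // mutable, GC-allocated copy
    char[] y = " world".dup;

    // x ~ y has to build a fresh array to hold its result.
    char[] z = x ~ y;
    writeln("x ~ y shares storage with x: ", sameStart(z, x));

    // x ~= y may extend x's block in place or reallocate it;
    // which one happens depends on the spare capacity of the block.
    auto before = x.ptr;
    x ~= y;
    writeln("x ~= y stayed in the same block: ", x.ptr is before);
}
```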

> Second, there is the issue of the spec.  The spec clearly states that 
> concatenation should result in a copy of both sides.  

1. The website cannot be trusted completely; when it comes to the spec 
it often lags behind the compiler.

2. It could be argued that "concatenation" is the x ~ y form and not the 
~= form, which is called "append".  From the website spec:

"The binary operator ~ is the cat operator. It is used to concatenate 
arrays"

"Similarly, the ~= operator means append"

"Concatenation always creates a copy of its operands, even if one of the 
operands is a 0 length array"

I'm probably splitting hairs here and I doubt there is much point 
arguing it - I just wanted to point out another way of reading the spec.

> Obviously, this isn't
> true in all cases.  The spec should be changed for both D 1.x and 2.x 
> IMMEDIATELY to prevent unsuspecting coders from using ~= when what they 
> really want is just ~.
>
> Third, I have not seen this T[new] operator described anywhere, but I am 
> concerned that D 1.0 will not be updated.  This leaves all coders who are 
> not ready to switch to D 2 at risk.  But from the inferred behavior of 
> T[new], I'm expecting that this will probably fix the problem.

Aside from the apparent invariant bug, the only case which causes me a 
slight worry is the case involving a struct.  The only solution I can 
imagine would be to somehow determine that the memory was originally 
allocated to a 'struct' and therefore that growing an 'array' which 
slices it must cause a copy.

I'm not sure what information the GC keeps on allocated blocks.  I 
believe there is a pointers/nopointers flag, and that could perhaps form 
the basis of a fairly crude test (as the struct contains pointers and 
char[] does not).

Even if nothing can be done to detect this case I'm not sure it's a huge 
issue.  After all, it only affects people who use a static array as the 
first member of a struct, take a slice of it and then modify (append to) 
that slice without performing "copy on write" - which is a no-no in D 
anyway.
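
For what it's worth, this is roughly what I mean by "copy on write" in 
that situation.  It is only a sketch: the Packet struct and the 
appendSafely function are names I made up for illustration, and it 
assumes a current D2 compiler.  The point is simply to .dup the slice 
before appending, so the append cannot touch memory that belongs to the 
struct.

```d
import std.stdio;

// Hypothetical struct with a static array as its first member.
struct Packet
{
    char[4] tag;
    int value;
}

// Hypothetical helper: duplicate the slice first, then append to the
// private copy, so the original storage is never written to.
char[] appendSafely(const(char)[] slice, const(char)[] extra)
{
    char[] copy = slice.dup;
    copy ~= extra;
    return copy;
}

void main()
{
    auto p = new Packet;
    p.tag = "abcd";
    p.value = 42;

    // Slice the static array, then append through the helper.
    char[] s = appendSafely(p.tag[], "ef");

    writeln(s);        // the extended copy
    writeln(p.value);  // the struct member is left alone
}
```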

Regan
