D array expansion and non-deterministic re-allocation
Steven Schveighoffer
schveiguy at yahoo.com
Tue Dec 1 04:34:58 PST 2009
On Thu, 26 Nov 2009 17:45:30 -0500, Bartosz Milewski
<bartosz-nospam at relisoft.com> wrote:
> Steve, I don't know about you, but this exchange clarified some things
> for me. The major one is that it's dangerous to define the semantics of
> a language construct in terms of implementation. You were defending some
> points using implementation arguments rather than sticking to defined
> semantics.
I was defending the semantics by using an example implementation. I was
not defining the semantics in terms of implementation. The semantics are
defined by the spec, and do not indicate when an array is reallocated and
when it is not. That detail is implementation defined. My examples use
dmd's implementation to show how the assumption can break. You said the
guy needs me to show him that it is broken, because all his tests pass.
Why can't I use my knowledge of the implementation to come up with an
example?
I could rewrite my statements as: "You should not rely on the array being
reallocated via append, because D does not guarantee such reallocation.
Using the reference implementation of dmd, it is possible to come up with
an example of where this fails: ..."
> We have found out that one should never rely on the array being
> re-allocated on expansion, even if it seems like there's no other way.
> The only correct statement is that the freshly expanded part of the
> array is guaranteed not to be write-shared with any other array.
I agree with this (except for "even if it seems like there's no other
way": the spec says an allocation always occurs when you do a ~ b, so you
can always rewrite a ~= b as a = a ~ b). In fact, at one point to avoid
stomping I went through Tango and found all places where append could
result in stomping, and changed the code this way. There were probably
fewer than 5 instances. Append is not a very common operation when you
didn't create the array to begin with.
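To make the rewrite above concrete, here is a minimal sketch. The function
name withExtra and the values are made up for illustration. It relies only
on the spec's statement that concatenation yields a new array, so the
result holds on any conforming implementation, not just dmd:

```d
// Hypothetical helper: returns the input plus one extra element.
// It uses a ~ b instead of a ~= b, so the result is always a fresh
// allocation and never shares memory with the caller's array.
int[] withExtra(int[] input, int extra)
{
    int[] result = input ~ extra; // concatenation always copies
    result[0] = 99;               // touches only the copy
    return result;
}

void main()
{
    int[] original = [1, 2, 3];
    int[] expanded = withExtra(original, 4);

    assert(original == [1, 2, 3]);        // caller's data is untouched
    assert(expanded == [99, 2, 3, 4]);
    assert(original.ptr !is expanded.ptr); // distinct memory blocks
}
```

Had withExtra used input ~= extra instead, whether result aliased original
would depend on the implementation, which is exactly the case to avoid.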
> However, this discussion veered away from a more important point. I
> don't believe that programmers will consciously make assumptions about
> re-allocation breaking sharing.
For the most part, this is ok -- rarely do you see someone append to an
array they didn't create *and* modify the original data.
My belief is that people are more likely to expect that appending to an
array *doesn't* reallocate. If you have experience in programming, the
language you are used to treats arrays either as value types or as
reference types. I don't think I've ever seen a language besides D that
uses a hybrid type for arrays. So you are going to come to D expecting
value or reference. If you expect value, you should quickly learn that's
not the case because 99% of the time, arrays look like reference types.
It is natural then to expect appending to an array to affect all other
aliases of that array; after all, it is a reference type. I just think
your examples don't ring true in practice because there are simpler ways
to guarantee allocation. You have to go out of your way to write bad code
that doesn't work correctly.
Finally, it's easy to turn an array into a reference type when passing as
a parameter: just use the ref decorator. All we need is a way to turn it
into a value type, and I think Andrei's idea of Value!(arr) would be great
for that.
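A small sketch of the difference, with made-up function names. Value!(arr)
is only a proposal at this point, so the value-type half uses .dup as a
stand-in, which is my substitution and not Andrei's design:

```d
// Without ref, the callee gets its own copy of the slice (pointer and
// length), so the caller's length cannot change.
void appendByValue(int[] arr) { arr ~= 4; }

// With ref, the callee works on the caller's slice itself, so the
// append is always visible to the caller.
void appendByRef(ref int[] arr) { arr ~= 4; }

void main()
{
    int[] a = [1, 2, 3];

    appendByValue(a);
    assert(a.length == 3);   // caller's slice is unchanged

    appendByRef(a);
    assert(a == [1, 2, 3, 4]);

    // Stand-in for value semantics: an explicit copy via .dup.
    int[] b = a.dup;
    b[0] = 99;
    assert(a[0] == 1);       // the original is not affected
}
```

None of these assertions depend on whether the runtime extends in place or
reallocates.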
> The danger is that it's easy to miss accidental sharing and it's very
> hard to test for it.
I think this danger is rare, and it's easy to search for (just search for
~= in your code; I did it with Tango). I think the behavior can be
explained very well in a tutorial or book chapter.
-Steve