Possible change to array runtime?

Fri Mar 14 10:38:06 PDT 2014

On Fri, 14 Mar 2014 11:58:06 -0400, Don <x at nospam.com> wrote:

> On Friday, 14 March 2014 at 14:48:13 UTC, Steven Schveighoffer wrote:
>> On Thu, 13 Mar 2014 11:24:01 -0400, Steven Schveighoffer  
>> <schveiguy at yahoo.com> wrote:
>>
>>
>>> arr.length = 0;
>>>
>> ...
>>> 3. Don's company uses D1 as its language, I highly recommend watching  
>>> Don's Dconf13 presentation (and look forward to his Dconf14 one!) to  
>>> see how effective D code can create unbelievable speed, especially  
>>> where array slices are concerned. But to the above line, in D2, they  
>>> must add the following code to get the same behavior:
>>>
>>> arr.assumeSafeAppend();
>>
>> Just a quick note, buried in same thread that Don mentioned, he  
>> outlined a more specific case, and this does not involve setting length  
>> to 0, but to any arbitrary value.
>>
>> This means my approach does not help them, and although it makes sense,  
>> the idea that it would help Sociomantic move to D2 is not correct.
>>
>> -Steve
>
> Actually it would help a great deal. In most cases, we do set the length  
> to 0. That example code is unusual.

If that example is unusual, then that case should be easy to fix (as I
stated in the other post). Can you estimate how many different ways your
code shrinks the length of an array? I assume all of them operate on slices
that start at the beginning of the block, since that was the requirement in
D1 (the slice had to point at the beginning of the block for append to work).

I imagine that adding this 'feature' of setting length = 0 would help, but
maybe adding a new symbol, similar to length, that means "Do what D1 length
would have done" would be less controversial to add to druntime. Example
strawman:

arr.slength = 0; // effectively the same as:
                 //   arr.length = 0; arr.assumeSafeAppend();

It would do the same thing, but the idea is it would work for extension  
too -- if arr points at the beginning of the block, and slength *grows*  
into the block, it would work the same as D1 as well -- adjusting the  
"used" length and not reallocating.

Essentially, you would run s/.length/.slength/ and everything would work.
Of course, length is not a keyword, so there may be other places where
length is used (reading the property, for instance) where slength would not
be needed.
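
A rough sketch of how the strawman could be approximated today in user code.
The name slength is hypothetical (it is the strawman above, not something in
druntime), and this version does not check that the slice starts at the
beginning of the block, which D1 required:

```d
// Hypothetical UFCS helper approximating the `slength` strawman.
// NOT part of druntime; built only from .length and assumeSafeAppend.
@property void slength(T)(ref T[] arr, size_t newLen)
{
    if (newLen > arr.length)
    {
        // Growing: first declare that arr marks the end of the used data,
        // so the length increase can extend in place if the block has room.
        arr.assumeSafeAppend();
        arr.length = newLen;
    }
    else
    {
        // Shrinking: cut the slice, then move the block's "used" length
        // back to the new end so later appends reuse the block.
        arr.length = newLen;
        arr.assumeSafeAppend();
    }
}

unittest
{
    int[] arr = [1, 2, 3, 4];
    auto p = arr.ptr;

    arr.slength = 0;      // same as: arr.length = 0; arr.assumeSafeAppend();
    arr ~= 10;            // reuses the block instead of reallocating
    assert(arr.ptr is p);
    assert(arr == [10]);

    arr.slength = 3;      // grows within the same block
    assert(arr.ptr is p);
    assert(arr == [10, 0, 0]);
}
```

It can be tried with something like dmd -unittest -main -run on the file.
Note that any other slice of the same block can be overwritten after this,
which is exactly the trade-off assumeSafeAppend makes.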

However, one thing we cannot fix is:

arr = arr[0..$-1];
arr ~= x;

This would reallocate due to stomp prevention, and I can't fix that. Do  
you have any cases like this, or do you always use .length?
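
To make the case above concrete, here is a small sketch in D2. The helper
name appendReallocates is made up for illustration; the rest uses only the
built-in append and assumeSafeAppend:

```d
// Illustrative helper (hypothetical name): appends one element to a copy of
// the slice and reports whether the runtime had to move the data.
bool appendReallocates(int[] s)
{
    auto p = s.ptr;
    s ~= 0;
    return s.ptr !is p;
}

unittest
{
    int[] orig = [1, 2, 3, 4];

    // The shortened slice no longer ends at the block's used data, so the
    // append reallocates rather than stomp on orig[3].
    auto arr = orig[0 .. $ - 1];
    assert(appendReallocates(arr));
    assert(orig[3] == 4);

    // Opting in to stomping: after assumeSafeAppend the append happens in
    // place and overwrites the element that orig still refers to.
    arr.assumeSafeAppend();
    assert(!appendReallocates(arr));
    assert(orig[3] == 0);
}
```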

> FYI: In D1, this was the most important idiom in the language.
> In the first D conference in 2007, a feature T[new] was described,  
> specifically to support this idiom in a safe manner. Implementation was  
> begun in the compiler. Unfortunately, it didn't happen in the end. I'm  
> not sure if it would actually have worked or not.

I think the benefits would have been very minimal. T[new] would be almost  
exactly like T[]. And you still would have to have updated all your code  
to use T[new]. When to use T[new] or T[] would have driven most people mad.

I was around back then, and I also remember Andrei, when writing TDPL,  
stating that the T[new] and T[] differences were so subtle and confusing  
that he was glad he didn't have to specify that.

> BTW you said somewhere that concatenation always allocates. What I  
> actually meant was ~=, not ~.

OK

> In our code it is always preceded by .length = 0 though.
> It's important that ~= should not allocate, when the existing capacity  
> is large enough.

That is the case in D2, as long as the slice ends exactly at the end of the
block's valid (used) data (setting length to 0 doesn't satisfy this,
obviously). This is a slight departure from D1, which required the slice
to point at the *beginning* of the data.
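
A short sketch of that D2 rule, using the capacity property to ask whether
an append would happen in place. The helper name appendsInPlace is made up
for illustration:

```d
// Illustrative helper (hypothetical name): capacity is 0 when an append
// would reallocate, otherwise it is the length the slice can reach in place.
bool appendsInPlace(T)(T[] s)
{
    return s.capacity > s.length;
}

unittest
{
    int[] arr;
    arr.reserve(8);          // ask for room for at least 8 ints
    arr ~= [1, 2, 3, 4];

    auto head = arr[0 .. 2]; // starts the block, stops short of the used data
    auto tail = arr[2 .. $]; // starts mid-block, ends at the used data

    assert(!appendsInPlace(head)); // D1 would have allowed this one
    assert(appendsInPlace(tail));  // D2 allows this one

    auto p = tail.ptr;
    tail ~= 5;                     // extends the used data in place
    assert(tail.ptr is p);

    // arr now stops one element short of the used data, so it has lost
    // its ability to append in place.
    assert(!appendsInPlace(arr));
}
```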

-Steve