Efficiency of using array as stack

Sat Mar 23 16:45:21 PDT 2013

On Saturday, 23 March 2013 at 19:33:06 UTC, Jonathan M Davis 
wrote:
> You might want to check out this article where someone ran into 
> similar issues:
>
> https://www.semitwist.com/articles/article/view/don-t-use-arrays-as-stacks
>
> And if you haven't read Steven's article on arrays, you should 
> do so:
>
> http://dlang.org/d-array-article.html

That was a good reading, thank you!  The whole matter became 
clearer for me.

But now, I have one more example which confuses me, and the 
articles don't seem to help right away with it.  In this example, 
I am moving a "sliding window" of length n; now that we're done 
with the stack, here is a simple usage pattern of the queue.  
What I do is repeatedly (n times also) add an element to the back 
of the array and then remove an element from the front of it, 
keeping the whole queue constantly large.  Before the start, I 
reserve c*n entries for my array, where c is a real number 
between 1 and 2.  I track reallocations by monitoring the address 
of an element of a which should remain still as long as the array 
is not reallocated.

-----
import std.algorithm;
import std.array;
import std.range;
import std.stdio;

void main ()
{
	int n = 2_000_000;
	double c = 1.5;
	auto a = array (iota (n));
	a.reserve (cast (int) (n * c));
	auto prev = &a[$ - 1];
	int changes = 0;
	foreach (i; 0..n)
	{
		debug {writeln (a.capacity);}
		a.length++;
		a = a[1..$];
		auto next = &a[$ - i - 1];
		if (prev != next)
		{
			changes++;
			prev = next;
		}
	}
	writeln (changes);
}
-----

Now, if I set c to 2, there are no reallocations: all the work 
goes with the 2*n pre-allocated elements.  But as I decrease c 
down to 1, the number of reallocations grows *linearly*, roughly 
1 per 2000 appends.  So for c = 1, there are 1044 reallocations 
in total.  And that's still quadratic behavior, albeit divided by 
a large constant (~2000)!

What happens (locally) is, once the array size is not enough, it 
starts being grown by blocks of 1024 or 891 (?!) elements in a 
repeating pattern; one of these reallocates, the other does not.  
The textbook growth implementation suggests doubling the array 
size, but it looks like D attempts something smarter here.  
However, in my case, the result is ~8x slower behavior when not 
pre-allocating.  The more is n, the slower is the version without 
pre-allocation (c = 1) compared to the version with 
pre-allocation (c = 2).  And this means that a D array is not 
also an efficient queue out-of-the-box.  However, as in the case 
with stack, it could perhaps be efficient with a bit of tweaking 
(some custom tailored calls to reserve perhaps?).

So, what is the exact strategy of growing an array?  Maybe you 
could just kindly point me to some more interesting reading in 
druntime source?  And, well, now how do I implement a generally 
efficient array-based queue in D?

-----
Ivan Kazmenko.