Tips and Tricks for D

Thu Jul 6 10:40:23 PDT 2006

mclysenk at mtu.edu wrote:
> In article <e8ajmp$o9c$1 at digitaldaemon.com>, Deewiant says...
> 
>>mclysenk at mtu.edu wrote:
>>
>>>You can read it at http://www.assertfalse.com/articles/tricks.shtml .
>>>
>>>Any comments, questions, suggestions or other feedback is welcome!
>>>
>>
>>I've always removed items from an array whilst preserving order in the following
>>way (using variables from your example):
>>
>>queue = queue[0..idx] ~ queue[idx+1..$];
>>
>>I did some test runs (benchmark source can be found at bottom of message), and
>>on my machine, at least, my version is consistently faster - it runs in about
>>60% of the time yours takes.
>>
> 
> 
> First of all, thanks for the feedback!  And second, you do have a good point.
> However, there is a slight bias in your test.  Instead of removing elements from
> arbitrary positions in the array, you are strictly removing elements from the
> first 10000 positions.  In your method, you are copying the beginning of the
> array, whereas in my method I am copying the end of the array.  If you try the
> following benchmark, my version runs about 25-40 times faster on my machine:
> 
> 
> import std.stdio, std.perf;
> 
> void main() {
> const int REPEATS = 1000,
> LENGTH  = 100000;
> 
> //Changed 0 to 1 in order to prevent bounds errors.
> const int[] removeThese = [1, 42, 999, 7, 1000, 6888, 999, 9992];
> int[] array;
> TickCounter t = new TickCounter();
> 
> t.start;
> for (int i = 0; i < REPEATS; ++i) {
> array.length = LENGTH;
> 
> foreach (n; removeThese) {
> n = array.length - n;  //Pull from the end of the array
> if (n != array.length - 1) {
> int[] tmp = array[n+1..$];
> array[n..$-1] = tmp[].dup;
> }
> array.length = array.length - 1;
> }
> }
> t.stop;
> 
> uint mTime = t.milliseconds;
> writefln("MLysenko: %6d ms for %6d repeats on %6d-length arrays (%.2f ms per
> removal)", mTime, REPEATS, LENGTH, cast(real)mTime / REPEATS);
> 
> t.start;
> for (int i = 0; i < REPEATS; ++i) {
> array.length = LENGTH;
> 
> foreach (n; removeThese)
> {
> n = array.length - n; // Remove from opposite side.
> array = array[0..n] ~ array[n+1..$];
> }
> }
> t.stop;
> 
> uint dTime = t.milliseconds;
> writefln("Deewiant: %6d ms for %6d repeats on %6d-length arrays (%.2f ms per
> removal)", dTime, REPEATS, LENGTH, cast(real)dTime / REPEATS);
> 
> writefln("D / M: %.2f", cast(real)dTime / mTime);
> writefln("M / D: %.2f", cast(real)mTime / dTime);
> }
> 
> 
> In the end, there isn't a single best answer to this problem.  Both approaches
> have their merit, and perhaps an adaptive strategy which combines the two might
> best. (However yours performs far more consistently than mine, which may be a
> big advantage in some cases.)

If I understand correctly, the last items in the array get shifted a 
number of times, which takes unnecessary time.

Imagine a "history diagram" of the array. In it I've put an "R" for the 
removables, and those in between are denoted with single lower case letters:

RaaaaRbbbbbRccRddddddReeeeeeeeeeRfffff

(So all items between the first and second removable are called "a".) 
Now, the history would look like:

RaaaaRbbbbbRccRddddddReeeeeeeeeeRfffff
aaaaRbbbbbRccRddddddReeeeeeeeeeRfffff
aaaabbbbbRccRddddddReeeeeeeeeeRfffff
aaaabbbbbccRddddddReeeeeeeeeeRfffff
aaaabbbbbccddddddReeeeeeeeeeRfffff
aaaabbbbbccddddddeeeeeeeeeeRfffff
aaaabbbbbccddddddeeeeeeeeeefffff

A lot less copying would be done with the following:

RaaaaRbbbbbRccRddddddReeeeeeeeeeRfffff
aaaa RbbbbbRccRddddddReeeeeeeeeeRfffff
aaaabbbbb  RccRddddddReeeeeeeeeeRfffff
aaaabbbbbcc   RddddddReeeeeeeeeeRfffff
aaaabbbbbccdddddd    ReeeeeeeeeeRfffff
aaaabbbbbccddddddeeeeeeeeee     Rfffff
aaaabbbbbccddddddeeeeeeeeeefffff

With this strategy, no element ever gets moved more than once.
Of course, this is only applicable when all the elements-to-be-removed 
are known before the start.

If you have an array of 1M elements, and remove, say a few hundred 
elements from near the beginning, then these two methods have a huge 
performance difference.

(And yes, like someone pointed out, there do exist real-life situations 
where one simply has to use an array for this, as opposed to a linked list.)

PS: I have no objection if you include this reasoning in your excellent 
Tips and Tricks, should you want to.   :-)