How should I sort a doubly linked list the D way?

Mirjam Akkersdijk noreply at noreply.com
Tue Aug 13 22:12:23 UTC 2019


On Tuesday, 13 August 2019 at 18:28:35 UTC, Ali Çehreli wrote:
> > can't I define an array of fixed size, which is dependent on
> the input
> > of the function?
>
>     arr.length = number_of_elements;
>
> All elements will be initialized to the element's default 
> value, which happens to be null for pointers. (If we are back 
> to linked list Node pointers.)
>
> However, I wouldn't bother with setting length either as the 
> cost of automatic array resizing is amortized, meaning that it 
> won't hurt the O(1) algorithmic complexity in the general case. 
> In the GC case that D uses, it will be even better: because if 
> the GC knowns that the neighboring memory block is free, it 
> will just add that to the dynamic array's capacity without 
> moving elements to the new location.

For the application I am writing, it makes very much sense to use 
a DLL. The approach I am using right now is to: (as stated in the 
first post)

void bsort(DLL* list) {
   Node*[] nodes;

   foreach (i; 0 .. list.size)
     nodes ~= get_node(list, i);

   nodes.sort!("a.x < b.x");
   // (unrelated data structure cleaning up goes here)
}

My concern was that using a dynamic array, continuously appending 
to it despite known size, would greatly impact overall 
performance. My goal is to make this operation as fast as 
possible, the D way.

I tried setting the length first and iterating through it with 
`nodes[i] = ...` and the implementation did not run noticeably 
faster (10 trials), per Teoh's advice to test out the difference.
For now, I consider this a success and would like to thank 
everyone in the thread for assistance and valuable insights.

Though, it left me with some semi-offtopic questions unanswered:

(1) Ali, do you mean that from an optimization viewpoint, it's 
better to keep appending like `nodes ~= ...` instead of setting 
the length first? I would like to have some more background on 
that claim.

(2) I watched the talk and read the paper [1] to be sure. Many 
contributors to this thread claim (C++) vectors are nearly always 
the preferred choice compared to DLLs, but how come they are much 
faster compared to DLLs, given they have nearly the same 
functionality?

(3) And I sense this is very closely related to what Ali is on?

[1] http://www.stroustrup.com/Software-for-infrastructure.pdf


More information about the Digitalmars-d-learn mailing list