How should I sort a doubly linked list the D way?

Wed Aug 14 10:39:19 UTC 2019

On Tuesday, 13 August 2019 at 18:28:35 UTC, Ali Çehreli wrote:
> On 08/13/2019 10:33 AM, Mirjam Akkersdijk wrote:
> > On Tuesday, 13 August 2019 at 14:04:45 UTC, Sebastiaan Koppe
> wrote:
>
> >> Convert the nodes into an D array, sort the array with
> nodes.sort!"a.x
> >> < b.x" and then iterate the array and repair the next/prev
> pointers.
>
> If possible, I would go further and ditch the linked list 
> altogether: Just append the nodes to an array and then sort the 
> array. It has been shown in research, conference presentations, 
> and in personal code to be the fasted option is most (or all) 
> cases.
>
> > doesn't the nature of the dynamic array slow it down a bit?
>
> Default bounds checking is going to cost a tiny bit, which you 
> can turn off after development with a compiler flag. (I still 
> wouldn't.)
>
> The only other option that would be faster is an array that's 
> sitting on the stack, created with alloca. But it's only for 
> cases where the thread will not run out of stack space and the 
> result of the array is not going to be used.
>
> > can't I define an array of fixed size, which is dependent on
> the input
> > of the function?
>
>     arr.length = number_of_elements;
>
> All elements will be initialized to the element's default 
> value, which happens to be null for pointers. (If we are back 
> to linked list Node pointers.)
>
> However, I wouldn't bother with setting length either as the 
> cost of automatic array resizing is amortized, meaning that it 
> won't hurt the O(1) algorithmic complexity in the general case. 
> In the GC case that D uses, it will be even better: because if 
> the GC knowns that the neighboring memory block is free, it 
> will just add that to the dynamic array's capacity without 
> moving elements to the new location.
>
> Summary: Ditch the linked list and put the elements into an 
> array. :)
>

There are mainly three reasons why arrays are nowadays faster 
than double linked lists:
- pointer chasing can difficultly be paralized and defeats 
prefetching. Each pointer load may cost the full latency to 
memory (hundreds of cycles). In a multiprocessor machine may also 
trigger a lot of coherency trafic.
- on 64 bit systems 2 pointers cost 16 bytes. If the payload is 
small, there is more memory used in the pointer than in the data.
- when looping in an array the OO machinery will be able to 
parallelize execution beyond loop limits.
- reduced allocation, i.e. allocation is done in bulk => faster 
GC for D.

It is only when there are a lot of external references to the 
payload in the list that using an array may become too unwieldy, 
i.e. if moving an element in memory requires the update of other 
pointers outside of the list.