Combsort comparison

bearophile bearophileHUGS at lycos.com
Sun Jun 20 08:54:40 PDT 2010


I have translated another small Python/C++ program to D, this is performs the combsort:
http://en.wikipedia.org/wiki/Combsort

The original code comes from Wikipedia and the rosettacode site.

This is a Python version, the algorithm is very simple:


def swap(alist, i, j):
    alist[i], alist[j] = alist[j], alist[i]

def combsort(a):
    gap = len(a)
    swaps = True
    while gap > 1 or swaps:
        gap = max(1, int(gap / 1.2473))
        swaps = False
        for i in xrange(len(a) - gap):
            if a[i] > a[i + gap]:
                swap(a, i, i + gap)
                swaps = True

a = [88,18,31,44,4,0,8,81,14,78,20,76,84,33,73,75,82,5,62,70]
combsort(a)
assert a == sorted(a)



A translation to D2 is easy and painless, no D/Phobos bugs found while writing it:

import std.algorithm;
 
void combsort(T)(T[] input) {
    int gap = input.length;
    bool swaps = true;
    while (gap > 1 || swaps) {
        gap = max(1, cast(int)(gap / 1.2473));
        swaps = false;
        foreach (i; 0 .. input.length - gap)
            if (input[i] > input[i + gap]) {
                swap(input[i], input[i + gap]);
                swaps = true;
            }
    }
}
 
void main() {
    auto a = [88,18,31,44,4,0,8,81,14,78,20,76,84,33,73,75,82,5,62,70];
    combsort(a);
    assert(a == a.dup.sort);
}



On Wikipedia I have seen a more general C++ version (see the C++ code at the bottom), so I have written a D version that uses ranges, able (in theory) to sort any Forward Range that contains Swappable items, see the D #3 at the bottom.

The translation to more generic D has taken me some time because I am not trained to use ranges, but I have found no real problem in translating the algorithm.

The two main problems found while creating this D #3 are:

1) I have tried to use the combsort to sort a std.container.SList, but you can't perform array() on such list, or even a walkLength (the same is true on a static array as input), so in practice it can't be used by combsort. In my opinion array() and walkLength() must work with anything that can be iterated with a foreach. My bug reports number 4346 and 4114 express this opinion.

2) D2 contract programming doesn't support the "old" (original input data view) yet, so I can't use a postcondition to test if combsort has actually done its work correctly (the result is sorted and contains the same items of the input). To solve this I have used a static nested function that does the actual combsort work, called from an outer function that in debug mode performs the tests. This is not nice.

I don't know if my D #3 code is the best possible using Ranges, maybe there is a way to use them more efficiently, I have just started using D Ranges. In this program the left and right "cursors" keep their relative distance, and in practice the C++ program shows that two "single" iterators (instead of two pairs) can be enough for this program. If you know how to improve the ranges usage in D #3 you can tell me.


I have very often heard that generic C++ programs (like the C++ version of combsort able to work on both linked lists and arrays) are about as efficient as the specialized C ones, I have so far accepted this, but this time I have done some benchmarks.

The C version of the code is adapted from the C code present in Wikipedia, it generates 10 million pseudo-random numbers on the heap and then sorts them. The D #1 version is a direct translation of the C version. The D #2 is a more natural translation to D. The D #3 is the approximated translation of the C++ code.

Timings, seconds, best of 3:
  D3:   5.94
  D2:   3.13
  C++:  3.05
  D1:   2.90
  C:    2.72

gcc -O3 -Wall comb_c.c -o comb_c
g++ -O3 -Wall comb_cpp.cpp -o comb_cpp
dmd -O -release -inline comb_d1

GCC version 4.5.0, dmd v2.047.

I have not timed the Python version because it's probably slow.

The performance difference between the D #1 and D #2 versions is small and probably not so important.

The performance difference between the C and D #1 versions can probably become about zero if the D code is compiled with LDC.

There is a significant speed difference between D #2 and D #3, almost two times slower (and the random number generation takes constant time, so probably the combsort itself of D #3 is about two times slower).

Even in the C Vs C++ case there is a bit of difference, but it's probably small enough, it can be significant only in special situations.

When writing very generic code in D I will take a look at the performance compared to a more specialized version of the code (often specialized for arrays).

-----------------------------

// C version
#include "stdio.h"
#include "stdlib.h"

//#define DEBUG

void combsort(int items[], int len) {
    int swapped = 1;
    int gap = len;
    while (gap > 1 || swapped) {
        if (gap > 1)
            gap /= 1.247330950103979;
        swapped = 0;
        int i, j;
        for (i = 0, j = gap; j < len; i++, j++) {
            if (items[i] > items[j]) {
                int temp = items[i];
                items[i] = items[j];
                items[j] = temp;
                swapped = 1;
            }
        }
    }
}

static inline int myrandom() {
    unsigned long const IM = 139968;
    unsigned long const IA = 3877;
    unsigned long const IC = 29573;
    static unsigned long last = 42;
    last = (last * IA + IC) % IM;
    return last;
}

#ifdef DEBUG
    int issorted(int items[], int len) {
        if (len < 2)
            return 1;

        int i;
        for (i = 0; i < (len-1); i++)
            if (items[i] > items[i+1])
                return 0;
        return 1;
    }
#endif

#define N (10 * 1000 * 1000)

int main() {
    int* v = malloc(sizeof(int) * N);
    int i;
    for (i = 0; i < N; i++)
        v[i] = myrandom();

#ifdef DEBUG
    for (i = 0; i < N; i++)
        printf("%d ", v[i]);
    printf("\n");
#endif

    combsort(v, N);

#ifdef DEBUG
    for (i = 0; i < N; i++)
        printf("%d ", v[i]);
    printf("\n");
#endif

    printf("%d\n", v[N-1]);

#ifdef DEBUG
    if (!issorted(v, N))
        printf("NOT sorted\n");
#endif

    return 0;
}

-----------------------------

// D #1 version
import std.c.stdlib: malloc;
import std.c.stdio: printf;

void combsort(int* items, int len) {
    int swapped = 1;
    int gap = len;
    while (gap > 1 || swapped) {
        if (gap > 1)
            gap /= 1.247330950103979;
        swapped = 0;
        int i, j;
        for (i = 0, j = gap; j < len; i++, j++) {
            if (items[i] > items[j]) {
                int temp = items[i];
                items[i] = items[j];
                items[j] = temp;
                swapped = 1;
            }
        }
    }
}

int myrandom() {
    enum uint IM = 139968;
    enum uint IA = 3877;
    enum uint IC = 29573;
    static uint last = 42;
    last = (last * IA + IC) % IM;
    return last;
}

debug {
    int issorted(int* items, int len) {
        if (len < 2)
            return 1;

        int i;
        for (i = 0; i < (len-1); i++)
            if (items[i] > items[i+1])
                return 0;
        return 1;
    }
}

void main() {
    enum int N = 10_000_000;
    int* v = cast(int*)malloc(int.sizeof * N);
    int i;
    for (i = 0; i < N; i++)
        v[i] = myrandom();

    debug {
        for (i = 0; i < N; i++)
            printf("%d ", v[i]);
        printf("\n");
    }

    combsort(v, N);

    debug {
        for (i = 0; i < N; i++)
            printf("%d ", v[i]);
        printf("\n");
    }

    printf("%d\n", v[N-1]);

    debug {
        if (!issorted(v, N))
            printf("NOT sorted\n");
    }
}

-----------------------------

// D #2 version
import std.c.stdio: printf;
import std.c.stdlib: malloc;
import std.algorithm: swap;

void combsort(T)(T[] input) {
    int gap = input.length;
    bool swaps = true;
    while (gap > 1 || swaps) {
        if (gap > 1)
            gap /= 1.2473;
        swaps = false;
        foreach (i; 0 .. input.length - gap)
            if (input[i] > input[i + gap]) {
                swap(input[i], input[i + gap]);
                swaps = true;
            }
    }
}

int myrandom() {
    enum uint IM = 139968;
    enum uint IA = 3877;
    enum uint IC = 29573;
    static uint last = 42;
    last = (last * IA + IC) % IM;
    return last;
}

debug {
    bool issorted(int[] items) {
        if (items.length < 2)
            return true;

        foreach (i, el; items[0 .. $-1])
            if (el > items[i+1])
                return false;
        return true;
    }
}

void main() {
    enum int N = 10_000_000;
    int* v = cast(int*)malloc(int.sizeof * N);
    int i;
    for (i = 0; i < N; i++)
        v[i] = myrandom();

    debug {
        for (i = 0; i < N; i++)
            printf("%d ", v[i]);
        printf("\n");
    }

    combsort(v[0 .. N]);

    debug {
        for (i = 0; i < N; i++)
            printf("%d ", v[i]);
        printf("\n");
    }

    printf("%d\n", v[N-1]);

    debug {
        if (!issorted(v[0 .. N]))
            printf("NOT sorted\n");
    }
}

-----------------------------

// D #3 version
import std.array: empty, popFront, popFrontN, front;
import std.range: walkLength, isForwardRange, hasSwappableElements;
import std.algorithm: swap, binaryFun;
import std.c.stdlib: malloc;
import std.c.stdio: printf;
debug {
    import std.contracts: enforce;
    import std.algorithm: sort, equal;
    import std.array: array;
}

void combsort(alias less="a < b", Range)(Range data)
  if (isForwardRange!Range && hasSwappableElements!Range) {

    static void combsort_impl(Range data) {
        enum double SHRINK_FACTOR = 1.247330950103979;
        int gap = walkLength(data);
        bool swapped = true;

        while (gap > 1 || swapped) {
            if (gap > 1)
                gap /= SHRINK_FACTOR;

            swapped = false;
            auto right = data;
            popFrontN(right, gap);

            for (auto left = data; !right.empty; left.popFront, right.popFront) {
                if (binaryFun!less(right.front, left.front)) {
                    swap(left.front, right.front);
                    swapped = true;
                }
            }
        }
    }

    // postcondition verified in debug mode only.
    // This is a workaround, D postconditions don't
    // support the "old" (original input data view) yet.
    debug {
        auto data_copy = array(data);
        sort!(less)(data_copy);
        combsort_impl(data);
        enforce(equal(data, data_copy));
    } else {
        combsort_impl(data);
    }
}

// the following is C-like code to minimize experimental variables

int myrandom() {
    enum uint IM = 139968;
    enum uint IA = 3877;
    enum uint IC = 29573;
    static uint last = 42;
    last = (last * IA + IC) % IM;
    return last;
}

debug {
    bool issorted(int[] items) {
        if (items.length < 2)
            return true;

        foreach (i, el; items[0 .. $-1])
            if (el > items[i+1])
                return false;
        return true;
    }
}

void main() {
    enum int N = 10_000_000;
    int* v = cast(int*)malloc(int.sizeof * N);
    int i;
    for (i = 0; i < N; i++)
        v[i] = myrandom();

    debug {
        for (i = 0; i < N; i++)
            printf("%d ", v[i]);
        printf("\n");
    }

    combsort(v[0 .. N]);

    debug {
        for (i = 0; i < N; i++)
            printf("%d ", v[i]);
        printf("\n");
    }

    printf("%d\n", v[N-1]);

    debug {
        if (!issorted(v[0 .. N]))
            printf("NOT sorted\n");
    }
}

-----------------------------

// C++ version
#include "stdio.h"
#include "stdlib.h"
#include <algorithm>

//#define DEBUG

template<class ForwardIterator> void combsort(ForwardIterator first, ForwardIterator last) {
    static const double shrink_factor = 1.247330950103979;
    typedef typename std::iterator_traits<ForwardIterator>::difference_type difference_type;
    difference_type gap = std::distance(first, last);
    bool swaps = true;

    while (gap > 1 || swaps) {
        if (gap > 1)
            gap = static_cast<difference_type>(gap / shrink_factor);

        swaps = false;
        ForwardIterator itLeft(first);
        ForwardIterator itRight(first);
        std::advance(itRight, gap);

        for ( ; itRight != last; ++itLeft, ++itRight)
            if (*itRight < *itLeft){
                std::iter_swap(itLeft, itRight);
                swaps = true;
            }
    }
}

static inline int myrandom() {
    unsigned long const IM = 139968;
    unsigned long const IA = 3877;
    unsigned long const IC = 29573;
    static unsigned long last = 42;
    last = (last * IA + IC) % IM;
    return last;
}

#ifdef DEBUG
    int issorted(int items[], int len) {
        if (len < 2)
            return 1;

        int i;
        for (i = 0; i < (len-1); i++)
            if (items[i] > items[i+1])
                return 0;
        return 1;
    }
#endif

#define N (10 * 1000 * 1000)

int main() {
    int* v = (int*)malloc(sizeof(int) * N);
    int i;
    for (i = 0; i < N; i++)
        v[i] = myrandom();

#ifdef DEBUG
    for (i = 0; i < N; i++)
        printf("%d ", v[i]);
    printf("\n");
#endif

    combsort(&(v[0]), &(v[N]));

#ifdef DEBUG
    for (i = 0; i < N; i++)
        printf("%d ", v[i]);
    printf("\n");
#endif

    printf("%d\n", v[N-1]);

#ifdef DEBUG
    if (!issorted(v, N))
        printf("NOT sorted\n");
#endif

    return 0;
}

-----------------------------

Bye,
bearophile


More information about the Digitalmars-d mailing list