Combsort comparison
bearophile
bearophileHUGS at lycos.com
Sun Jun 20 08:54:40 PDT 2010
I have translated another small Python/C++ program to D, this is performs the combsort:
http://en.wikipedia.org/wiki/Combsort
The original code comes from Wikipedia and the rosettacode site.
This is a Python version, the algorithm is very simple:
def swap(alist, i, j):
alist[i], alist[j] = alist[j], alist[i]
def combsort(a):
gap = len(a)
swaps = True
while gap > 1 or swaps:
gap = max(1, int(gap / 1.2473))
swaps = False
for i in xrange(len(a) - gap):
if a[i] > a[i + gap]:
swap(a, i, i + gap)
swaps = True
a = [88,18,31,44,4,0,8,81,14,78,20,76,84,33,73,75,82,5,62,70]
combsort(a)
assert a == sorted(a)
A translation to D2 is easy and painless, no D/Phobos bugs found while writing it:
import std.algorithm;
void combsort(T)(T[] input) {
int gap = input.length;
bool swaps = true;
while (gap > 1 || swaps) {
gap = max(1, cast(int)(gap / 1.2473));
swaps = false;
foreach (i; 0 .. input.length - gap)
if (input[i] > input[i + gap]) {
swap(input[i], input[i + gap]);
swaps = true;
}
}
}
void main() {
auto a = [88,18,31,44,4,0,8,81,14,78,20,76,84,33,73,75,82,5,62,70];
combsort(a);
assert(a == a.dup.sort);
}
On Wikipedia I have seen a more general C++ version (see the C++ code at the bottom), so I have written a D version that uses ranges, able (in theory) to sort any Forward Range that contains Swappable items, see the D #3 at the bottom.
The translation to more generic D has taken me some time because I am not trained to use ranges, but I have found no real problem in translating the algorithm.
The two main problems found while creating this D #3 are:
1) I have tried to use the combsort to sort a std.container.SList, but you can't perform array() on such list, or even a walkLength (the same is true on a static array as input), so in practice it can't be used by combsort. In my opinion array() and walkLength() must work with anything that can be iterated with a foreach. My bug reports number 4346 and 4114 express this opinion.
2) D2 contract programming doesn't support the "old" (original input data view) yet, so I can't use a postcondition to test if combsort has actually done its work correctly (the result is sorted and contains the same items of the input). To solve this I have used a static nested function that does the actual combsort work, called from an outer function that in debug mode performs the tests. This is not nice.
I don't know if my D #3 code is the best possible using Ranges, maybe there is a way to use them more efficiently, I have just started using D Ranges. In this program the left and right "cursors" keep their relative distance, and in practice the C++ program shows that two "single" iterators (instead of two pairs) can be enough for this program. If you know how to improve the ranges usage in D #3 you can tell me.
I have very often heard that generic C++ programs (like the C++ version of combsort able to work on both linked lists and arrays) are about as efficient as the specialized C ones, I have so far accepted this, but this time I have done some benchmarks.
The C version of the code is adapted from the C code present in Wikipedia, it generates 10 million pseudo-random numbers on the heap and then sorts them. The D #1 version is a direct translation of the C version. The D #2 is a more natural translation to D. The D #3 is the approximated translation of the C++ code.
Timings, seconds, best of 3:
D3: 5.94
D2: 3.13
C++: 3.05
D1: 2.90
C: 2.72
gcc -O3 -Wall comb_c.c -o comb_c
g++ -O3 -Wall comb_cpp.cpp -o comb_cpp
dmd -O -release -inline comb_d1
GCC version 4.5.0, dmd v2.047.
I have not timed the Python version because it's probably slow.
The performance difference between the D #1 and D #2 versions is small and probably not so important.
The performance difference between the C and D #1 versions can probably become about zero if the D code is compiled with LDC.
There is a significant speed difference between D #2 and D #3, almost two times slower (and the random number generation takes constant time, so probably the combsort itself of D #3 is about two times slower).
Even in the C Vs C++ case there is a bit of difference, but it's probably small enough, it can be significant only in special situations.
When writing very generic code in D I will take a look at the performance compared to a more specialized version of the code (often specialized for arrays).
-----------------------------
// C version
#include "stdio.h"
#include "stdlib.h"
//#define DEBUG
void combsort(int items[], int len) {
int swapped = 1;
int gap = len;
while (gap > 1 || swapped) {
if (gap > 1)
gap /= 1.247330950103979;
swapped = 0;
int i, j;
for (i = 0, j = gap; j < len; i++, j++) {
if (items[i] > items[j]) {
int temp = items[i];
items[i] = items[j];
items[j] = temp;
swapped = 1;
}
}
}
}
static inline int myrandom() {
unsigned long const IM = 139968;
unsigned long const IA = 3877;
unsigned long const IC = 29573;
static unsigned long last = 42;
last = (last * IA + IC) % IM;
return last;
}
#ifdef DEBUG
int issorted(int items[], int len) {
if (len < 2)
return 1;
int i;
for (i = 0; i < (len-1); i++)
if (items[i] > items[i+1])
return 0;
return 1;
}
#endif
#define N (10 * 1000 * 1000)
int main() {
int* v = malloc(sizeof(int) * N);
int i;
for (i = 0; i < N; i++)
v[i] = myrandom();
#ifdef DEBUG
for (i = 0; i < N; i++)
printf("%d ", v[i]);
printf("\n");
#endif
combsort(v, N);
#ifdef DEBUG
for (i = 0; i < N; i++)
printf("%d ", v[i]);
printf("\n");
#endif
printf("%d\n", v[N-1]);
#ifdef DEBUG
if (!issorted(v, N))
printf("NOT sorted\n");
#endif
return 0;
}
-----------------------------
// D #1 version
import std.c.stdlib: malloc;
import std.c.stdio: printf;
void combsort(int* items, int len) {
int swapped = 1;
int gap = len;
while (gap > 1 || swapped) {
if (gap > 1)
gap /= 1.247330950103979;
swapped = 0;
int i, j;
for (i = 0, j = gap; j < len; i++, j++) {
if (items[i] > items[j]) {
int temp = items[i];
items[i] = items[j];
items[j] = temp;
swapped = 1;
}
}
}
}
int myrandom() {
enum uint IM = 139968;
enum uint IA = 3877;
enum uint IC = 29573;
static uint last = 42;
last = (last * IA + IC) % IM;
return last;
}
debug {
int issorted(int* items, int len) {
if (len < 2)
return 1;
int i;
for (i = 0; i < (len-1); i++)
if (items[i] > items[i+1])
return 0;
return 1;
}
}
void main() {
enum int N = 10_000_000;
int* v = cast(int*)malloc(int.sizeof * N);
int i;
for (i = 0; i < N; i++)
v[i] = myrandom();
debug {
for (i = 0; i < N; i++)
printf("%d ", v[i]);
printf("\n");
}
combsort(v, N);
debug {
for (i = 0; i < N; i++)
printf("%d ", v[i]);
printf("\n");
}
printf("%d\n", v[N-1]);
debug {
if (!issorted(v, N))
printf("NOT sorted\n");
}
}
-----------------------------
// D #2 version
import std.c.stdio: printf;
import std.c.stdlib: malloc;
import std.algorithm: swap;
void combsort(T)(T[] input) {
int gap = input.length;
bool swaps = true;
while (gap > 1 || swaps) {
if (gap > 1)
gap /= 1.2473;
swaps = false;
foreach (i; 0 .. input.length - gap)
if (input[i] > input[i + gap]) {
swap(input[i], input[i + gap]);
swaps = true;
}
}
}
int myrandom() {
enum uint IM = 139968;
enum uint IA = 3877;
enum uint IC = 29573;
static uint last = 42;
last = (last * IA + IC) % IM;
return last;
}
debug {
bool issorted(int[] items) {
if (items.length < 2)
return true;
foreach (i, el; items[0 .. $-1])
if (el > items[i+1])
return false;
return true;
}
}
void main() {
enum int N = 10_000_000;
int* v = cast(int*)malloc(int.sizeof * N);
int i;
for (i = 0; i < N; i++)
v[i] = myrandom();
debug {
for (i = 0; i < N; i++)
printf("%d ", v[i]);
printf("\n");
}
combsort(v[0 .. N]);
debug {
for (i = 0; i < N; i++)
printf("%d ", v[i]);
printf("\n");
}
printf("%d\n", v[N-1]);
debug {
if (!issorted(v[0 .. N]))
printf("NOT sorted\n");
}
}
-----------------------------
// D #3 version
import std.array: empty, popFront, popFrontN, front;
import std.range: walkLength, isForwardRange, hasSwappableElements;
import std.algorithm: swap, binaryFun;
import std.c.stdlib: malloc;
import std.c.stdio: printf;
debug {
import std.contracts: enforce;
import std.algorithm: sort, equal;
import std.array: array;
}
void combsort(alias less="a < b", Range)(Range data)
if (isForwardRange!Range && hasSwappableElements!Range) {
static void combsort_impl(Range data) {
enum double SHRINK_FACTOR = 1.247330950103979;
int gap = walkLength(data);
bool swapped = true;
while (gap > 1 || swapped) {
if (gap > 1)
gap /= SHRINK_FACTOR;
swapped = false;
auto right = data;
popFrontN(right, gap);
for (auto left = data; !right.empty; left.popFront, right.popFront) {
if (binaryFun!less(right.front, left.front)) {
swap(left.front, right.front);
swapped = true;
}
}
}
}
// postcondition verified in debug mode only.
// This is a workaround, D postconditions don't
// support the "old" (original input data view) yet.
debug {
auto data_copy = array(data);
sort!(less)(data_copy);
combsort_impl(data);
enforce(equal(data, data_copy));
} else {
combsort_impl(data);
}
}
// the following is C-like code to minimize experimental variables
int myrandom() {
enum uint IM = 139968;
enum uint IA = 3877;
enum uint IC = 29573;
static uint last = 42;
last = (last * IA + IC) % IM;
return last;
}
debug {
bool issorted(int[] items) {
if (items.length < 2)
return true;
foreach (i, el; items[0 .. $-1])
if (el > items[i+1])
return false;
return true;
}
}
void main() {
enum int N = 10_000_000;
int* v = cast(int*)malloc(int.sizeof * N);
int i;
for (i = 0; i < N; i++)
v[i] = myrandom();
debug {
for (i = 0; i < N; i++)
printf("%d ", v[i]);
printf("\n");
}
combsort(v[0 .. N]);
debug {
for (i = 0; i < N; i++)
printf("%d ", v[i]);
printf("\n");
}
printf("%d\n", v[N-1]);
debug {
if (!issorted(v[0 .. N]))
printf("NOT sorted\n");
}
}
-----------------------------
// C++ version
#include "stdio.h"
#include "stdlib.h"
#include <algorithm>
//#define DEBUG
template<class ForwardIterator> void combsort(ForwardIterator first, ForwardIterator last) {
static const double shrink_factor = 1.247330950103979;
typedef typename std::iterator_traits<ForwardIterator>::difference_type difference_type;
difference_type gap = std::distance(first, last);
bool swaps = true;
while (gap > 1 || swaps) {
if (gap > 1)
gap = static_cast<difference_type>(gap / shrink_factor);
swaps = false;
ForwardIterator itLeft(first);
ForwardIterator itRight(first);
std::advance(itRight, gap);
for ( ; itRight != last; ++itLeft, ++itRight)
if (*itRight < *itLeft){
std::iter_swap(itLeft, itRight);
swaps = true;
}
}
}
static inline int myrandom() {
unsigned long const IM = 139968;
unsigned long const IA = 3877;
unsigned long const IC = 29573;
static unsigned long last = 42;
last = (last * IA + IC) % IM;
return last;
}
#ifdef DEBUG
int issorted(int items[], int len) {
if (len < 2)
return 1;
int i;
for (i = 0; i < (len-1); i++)
if (items[i] > items[i+1])
return 0;
return 1;
}
#endif
#define N (10 * 1000 * 1000)
int main() {
int* v = (int*)malloc(sizeof(int) * N);
int i;
for (i = 0; i < N; i++)
v[i] = myrandom();
#ifdef DEBUG
for (i = 0; i < N; i++)
printf("%d ", v[i]);
printf("\n");
#endif
combsort(&(v[0]), &(v[N]));
#ifdef DEBUG
for (i = 0; i < N; i++)
printf("%d ", v[i]);
printf("\n");
#endif
printf("%d\n", v[N-1]);
#ifdef DEBUG
if (!issorted(v, N))
printf("NOT sorted\n");
#endif
return 0;
}
-----------------------------
Bye,
bearophile
More information about the Digitalmars-d
mailing list