design question

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Fri Apr 3 09:53:18 PDT 2009


I'm toying with a rather new approach to a certain problem, and wanted 
to ask for some feedback before coding with it.

The STL has two functions with a "stable" variant: stable_sort and 
stable_partition. They differ from sort and partition in that they 
preserve the order of equivalent objects. They are more costly, hence 
the default is "unstable".

I started with the same approach in std.algorithm, but before long I 
figured two things:

a) There are many more algorithms that can benefit from stable vs. 
unstable versions: remove, topN, schwartzSort, partialSort, completeSort 
(take a sorted range and an unsorted one and make a sorted range out of 
them), and makeIndex.

b) Aside from stable and unstable, there's a semistable notion: when 
partitioning, the left half stays stable and the right half is unstable.

Clearly encoding stability in the function name isn't faring very well, 
so I made a policy of it:

enum SwapStrategy { stable, semistable, unstable }

and parameterized the appropriate functions with SwapStrategy. A use 
case looks like this:

// even to the left, odd to the right
auto arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
auto p = partition!("(a & 1) == 0", SwapStrategy.stable)(arr);
assert(arr == [2, 4, 6, 8, 10, 1, 3, 5, 7, 9]);
assert(p == arr[5 .. $]);

Ok, until now there's only been background :o|. The new part starts here.

I think the stability request could be better encoded as a wrapper of 
the range making the request. For example:

auto p = partition!("(a & 1) == 0")(keepStable(arr));

So in order to tell partition you want stability, you just wrap the 
range with a call to keepStable. (The runtime cost is negligible). 
Similarly, say you want a stable sort. You'd say:

sort(keepStable(arr));

instead of:

sort!(SwapStrategy.stable)(arr);

I have the feeling this approach will scale better in the future. Also 
it is less verbose and in sync with the current way of doing searches:

auto f = find(assumeSorted(haystack), needle);

In that case, assumeSorted uses the same means of transmitting 
information about a range into an algorithm.

So... I'd appreciate any thoughts you might have. Thanks.


Andrei



More information about the Digitalmars-d mailing list