cashew.utils.array 0.1a.2
Chris Nicholson-Sauls
ibisbasenji at gmail.com
Thu Sep 14 12:32:54 PDT 2006
Mikola Lysenko wrote:
> Chris Nicholson-Sauls wrote:
>
>> /***********************************************************************************
>>
>> * Return an array composed of common elements of two arrays.
>> */
>> T[] intersect (T) (inout T[] alpha, inout T[] beta)
>> body {
>> T[] result;
>>
>> foreach (x; alpha) {
>> if (beta.contains(x)) {
>> result ~= x;
>> }
>> }
>> return result;
>> }
>> unittest {
>> writef("\t.intersect(T[]) ----> ");
>> auto foo = array!(int)(1, 2, 3, 4, 5 );
>> auto bar = array!(int)( 3, 4, 5, 6, 7);
>> assert(foo.intersect(bar) == array!(int)(3, 4, 5));
>> writefln("Pass");
>> }
>>
>> /***********************************************************************************
>>
>> * Remove duplicate values from an array.
>> */
>> void unique (T) (inout T[] haystack)
>> body {
>> size_t ridx ;
>>
>> for (int i = 0; i < haystack.length; i++) {
>> ridx = size_t.max;
>> while ((ridx = haystack.rindexOf(haystack[i], ridx)) != i) {
>> haystack.removeIndex(ridx);
>> }
>> }
>> }
>> unittest {
>> writef("\t.unique() ----------> ");
>> auto foo = array!(int)(1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5);
>> foo.unique();
>> assert(foo == array!(int)(1, 2, 3, 4, 5));
>> writefln("Pass");
>> }
>>
>
> These algorithms can be implemented more efficiently. The running time
> for the intersect is O(n * m) (where n,m are the length of the two
> arrays). However, what about the case where alpha and beta are ordered?
> Here is a simple algorithm for getting the intersect:
>
> T[] intersect (T) (inout T[] alpha, inout T[] beta)
> body
> {
>
> /* Assume alpha, beta are ordered such that alpha[i] < alpha[i+1]
> * and beta[i] < beta[i+1] for all i.
> */
>
> T[] result;
> size_t a_pos = 0;
> size_t b_pos = 0;
>
> while(a_pos < alpha.length && b_pos < beta.length)
> {
> if(alpha[a_pos] == beta[b_pos])
> {
> result ~= alpha[a_pos];
> ++a_pos;
> ++b_pos;
> }
> else if(alpha[a_pos] < beta[b_pos])
> ++a_pos;
> else if(beta[b_pos] < alpha[a_pos])
> ++b_pos;
> }
>
> return result;
> }
>
>
> The cost for this algorithm is only O(n + m). However, ordering the two
> arrays is effectively an O(nlogn + mlogm) operation. Therefore the
> running time for this modified algorithm is overall O(nlogn + mlogm). A
> similar argument can be made for the unique function.
True, and not a bad idea. I'm not sure how it would be with arrays of classes/structs
which have no defined order. I'll try a rewrite and post it later. :) Maybe I should
just ask for a DSource entry for Cashew.
-- Chris Nicholson-Sauls
More information about the Digitalmars-d-announce
mailing list