Help with an algorithm!

CRAIG DILLABAUGH via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Jun 15 06:16:24 PDT 2017


On Thursday, 15 June 2017 at 11:48:54 UTC, Ivan Kazmenko wrote:
> On Thursday, 15 June 2017 at 06:06:01 UTC, MGW wrote:
>> There are two arrays of string [] mas1, mas2; Size of each 
>> about 5M lines. By the size they different, but lines in both 
>> match for 95%. It is necessary to find all lines in an array 
>> of mas2 which differ from mas1. The principal criterion - 
>> speed. There are the 8th core processor and it is good to 
>> involve a multithreading.
>
> The approaches which come to mind are:
>
clip
> taking constant time.
>
> Ivan Kazmenko.

As a follow up to this, if your alphabet is reasonably small 
perhaps could run radix sort based on the first few characters to 
split your arrays up into smaller subsets, and then use one of 
Ivan's suggestions within each subset.  Subset searches could be 
easily run in parallel.


More information about the Digitalmars-d-learn mailing list