Storing and Searching large text lists

brian via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Dec 31 16:41:56 PST 2015


I have a large list, B, of string items. For each item in that 
large list, I need to see if it is in the smaller list, A.

I have been using a simple string array for the storage of A:

string[] A;

and then using foreach to go through all the items of B and check 
whether they are in A:

import std.algorithm : find;
import std.stdio : writeln;

foreach (item; B)
    /* this looks hacky, but it wasn't working without the != 0 bit */
    if (find(A, item).length != 0)
        writeln("Found a line: ", item);

This works for small datasets, but when either A or B gets 
large (A could be up to 150k records, B in the millions) it takes 
quite a while to run.

I'd like to know what is the best way to store lists of text for 
searching. Is there a better container than a simple array? 
Neither A nor B need to be ordered for my purpose, but would 
sorting help the search? Would it help enough to be worth the CPU 
expense?
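
For reference, here is a minimal sketch of the two options the question 
points at. It is an illustration under stated assumptions, not a measured 
answer. The function names matchesViaHash and matchesViaSort and the sample 
data are hypothetical. The first approach stores A as the keys of a built-in 
associative array, so each membership test is a hash lookup instead of a 
scan over all of A. The second sorts a copy of A once and then 
binary-searches it through the SortedRange that std.algorithm's sort 
returns.

```d
import std.algorithm : sort;
import std.stdio : writeln;

// Approach 1: hashed lookup. Returns the items of b that occur in a.
string[] matchesViaHash(string[] a, string[] b)
{
    bool[string] inA;                 // associative array used as a set
    foreach (item; a)
        inA[item] = true;             // only the keys matter

    string[] found;
    foreach (item; b)
        if (item in inA)              // hash lookup, no scan over a
            found ~= item;
    return found;
}

// Approach 2: sort a copy of a once, then binary-search it per item.
string[] matchesViaSort(string[] a, string[] b)
{
    auto sortedA = sort(a.dup);       // SortedRange over a sorted copy
    string[] found;
    foreach (item; b)
        if (sortedA.contains(item))   // binary search on the sorted copy
            found ~= item;
    return found;
}

void main()
{
    // Hypothetical sample data standing in for the real lists.
    string[] A = ["apple", "pear", "plum"];
    string[] B = ["kiwi", "pear", "fig", "apple", "pear"];

    foreach (item; matchesViaHash(A, B))
        writeln("Found a line: ", item);

    // Both approaches should report the same matches in the same order.
    assert(matchesViaHash(A, B) == matchesViaSort(A, B));
}
```

With A at around 150k records, a binary search needs roughly 17 to 18 
string comparisons per item of B, against up to 150k for the linear scan. 
Only A needs sorting, and it is sorted once. The associative array skips the 
sort entirely at the cost of extra memory for the hash table. Which of the 
two is faster on real data would need to be timed.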

Regards
B