Reserving/Preallocating associative array?
Daniel Kozak
kozzi11 at gmail.com
Fri Dec 27 10:25:04 PST 2013
On Tuesday, 24 December 2013 at 22:28:21 UTC, Gordon wrote:
> Hello,
>
> I want to load a large text file containing two numeric fields
> into an associative array.
> The file looks like:
> 1 40
> 4 2
> 42 11
> ...
>
> And has 11M lines.
>
> My code looks like this:
> ===
> void main()
> {
>     size_t[size_t] unions;
>     auto f = File("input.txt");
>     foreach ( line ; f.byLine() ) {
>         auto fields = line.split();
>         size_t i = to!size_t(fields[0]);
>         size_t j = to!size_t(fields[1]);
>         unions[i] = j; // <-- here be question
>     }
> }
> ===
>
> This is just a test code to illustrate my question (though
> general comments are welcomed - I'm new to D).
>
> Commenting out the highlighted line (not populating the hash),
> the program completes in 25 seconds.
> Compiling with the highlighted line, the program takes ~3.5
> minutes.
>
> Is there a way to speed the loading? perhaps reserving memory
> in the hash before populating it? Or another trick?
>
> Many thanks,
> -gordon
Using OrderedAA improves speed 3x:
https://github.com/Kozzi11/Trash/tree/master/util
import util.orderedaa;

int main(string[] args)
{
    import std.stdio, std.conv, std.string, core.memory;
    import bylinefast;

    GC.disable; // no GC collections while the table is being filled
    OrderedAA!(size_t, size_t, 1_000_007) unions;
    //size_t[size_t] unions;
    foreach (line; "input.txt".File.byLineFast) {
        line.munch(" \t"); // skip ws
        immutable i = line.parse!size_t; // parse consumes the digits from the front of line
        line.munch(" \t"); // skip ws
        immutable j = line.parse!size_t;
        unions[i] = j;
    }
    GC.enable;

    return 0;
}
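
For illustration only, here is a minimal sketch of a hash map whose bucket count is fixed up front, so it never has to rehash while it is being filled. This is not the OrderedAA code from the repository linked above, and I am only assuming that the 1_000_007 template argument plays the role of a bucket count. FixedBucketMap and find are hypothetical names made up for this sketch.

```d
// Hypothetical sketch: a chained hash map with a compile-time bucket count.
// Not the OrderedAA implementation; names and layout are assumptions.
struct FixedBucketMap(K, V, size_t bucketCount)
{
    private static struct Entry { K key; V value; }
    private Entry[][] buckets; // one chain (dynamic array) per bucket

    private size_t indexOf(K key)
    {
        // TypeInfo.getHash is the same hook the built-in AA relies on
        return typeid(K).getHash(&key) % bucketCount;
    }

    // map[key] = value: insert a new entry or overwrite an existing one
    void opIndexAssign(V value, K key)
    {
        if (buckets.length == 0)
            buckets = new Entry[][](bucketCount); // allocate all buckets once
        auto chain = &buckets[indexOf(key)];
        foreach (ref e; *chain)
        {
            if (e.key == key)
            {
                e.value = value;
                return;
            }
        }
        *chain ~= Entry(key, value);
    }

    // Returns a pointer to the stored value, or null if the key is absent
    V* find(K key)
    {
        if (buckets.length == 0)
            return null;
        foreach (ref e; buckets[indexOf(key)])
        {
            if (e.key == key)
                return &e.value;
        }
        return null;
    }
}

void main()
{
    FixedBucketMap!(size_t, size_t, 1_000_007) unions;
    unions[1] = 40;
    unions[4] = 2;
    unions[42] = 11;
    unions[42] = 12; // overwrites the earlier value for key 42
    assert(*unions.find(42) == 12);
    assert(unions.find(7) is null);
}
```

With 11M keys spread over roughly 1M buckets, each chain would hold about 11 entries on average, so a larger bucket count would shorten lookups at the cost of more memory.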