Efficiently streaming data to associative array
Steven Schveighoffer via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Tue Aug 8 09:00:17 PDT 2017
On 8/8/17 11:28 AM, Guillaume Chatelet wrote:
> Let's say I'm processing megabytes of data, lazily iterating over the
> incoming lines and storing data in an associative array. I don't want to
> copy unless I have to.
>
> Contrived example follows:
>
> input file
> ----------
> a,b,15
> c,d,12
> ....
>
> Efficient ingestion
> -------------------
> import std.format : formattedRead;
> import std.stdio : stdin, writeln;
>
> void main() {
>
>     size_t[string][string] indexed_map;
>
>     foreach(char[] line ; stdin.byLine) {
>         char[] a;
>         char[] b;
>         size_t value;
>         line.formattedRead!"%s,%s,%d"(a,b,value);
>
>         // Look up with the mutable slice; only idup when the key is new.
>         auto pA = a in indexed_map;
>         if(pA is null) {
>             pA = &(indexed_map[a.idup] = (size_t[string]).init);
>         }
>
>         auto pB = b in (*pA);
>         if(pB is null) {
>             pB = &((*pA)[b.idup] = size_t.init);
>         }
>
>         // Technically unneeded but let's say we have more than 2 dimensions.
>         (*pB) = value;
>     }
>
>     indexed_map.writeln;
> }
>
>
> I qualify this code as ugly but fast. Any idea on how to make this less
> ugly? Is there something in Phobos to help?
I wouldn't use formattedRead, as I think this is going to allocate
temporaries for a and b.
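As a rough sketch of what avoiding formattedRead could look like (not code from this thread; the helper name `ingest` is made up for illustration, and the choice of `findSplit` from std.algorithm.searching is my assumption about one reasonable way to do it), the fields can be taken as slices of the line buffer, so a key is only copied with idup the first time it is seen. Since byLine reuses its buffer, the copy on insertion is still required.

```d
import std.algorithm.searching : findSplit;
import std.conv : to;
import std.stdio : stdin, writeln;

// Hypothetical helper: builds the nested map from any range of char[] lines.
size_t[string][string] ingest(Lines)(Lines lines)
{
    size_t[string][string] indexedMap;

    foreach (char[] line; lines)
    {
        // findSplit returns slices of `line`; nothing is copied here.
        auto first = line.findSplit(",");
        auto second = first[2].findSplit(",");

        // Skip lines that do not contain two separators.
        if (first[1].length == 0 || second[1].length == 0)
            continue;

        char[] a = first[0];
        char[] b = second[0];
        size_t value = second[2].to!size_t;

        if (auto inner = a in indexedMap)
        {
            if (auto pValue = b in *inner)
                *pValue = value;            // both keys known: no allocation
            else
                (*inner)[b.idup] = value;   // new inner key: copy it once
        }
        else
        {
            // New outer key: copy both keys, since the buffer will be reused.
            indexedMap[a.idup] = [b.idup : value];
        }
    }
    return indexedMap;
}

void main()
{
    ingest(stdin.byLine).writeln;
}
```

This keeps the "look up with the slice, idup only on a miss" idea from the original code; whether it is actually faster than formattedRead would need to be measured.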
Note, this is very close to Jon Degenhardt's blog post in May:
https://dlang.org/blog/2017/05/24/faster-command-line-tools-in-d/
-Steve