Principled method of lookup-or-insert in associative arrays?

Bruno Medeiros brunodomedeiros+spam at com.gmail
Thu Dec 2 06:18:32 PST 2010


On 20/11/2010 08:07, Andrei Alexandrescu wrote:
> TDPL has an example that can be reduced as follows:
>
> void main() {
> uint[string] map;
> foreach (line; stdin.byLine()) {
> ++map[line];
> }
> }
>
> byLine reuses its buffer so it exposes it as char[]. Therefore,
> attempting to use map[line] will fail. The program compiled and did the
> wrong thing because of a bug.
>
> The challenge is devising a means to make the program compile and work
> as expected. Looking up an uint[string] with a char[] should work, and
> if the char[] is to be inserted, a copy should be made (via .idup or
> to!string). The rule needs to be well defined and reasonably general.
>
> The effect is something like this:
>
> void main() {
> uint[string] map;
> foreach (line; stdin.byLine()) {
> auto p = line in map;
> if (p) ++*p;
> else map[line.idup] = 1;
> }
> }
>
> Ideally the programmer would write the simple code (first snippet) and
> achieve the semantics of the more complex code (second snippet). Any ideas?
>
> Andrei

An idea:

	import std.conv;
	//...
	uint[string] map;
	foreach (line; stdin.byLine()) {
		map.getOrPut(line, 0, to!string) ++;
	}


Explanation:
getOrPut retrieves entry under key "line", returns a ref value if found. 
If not found, it creates a new entry with value 0 (second parameter), 
and with key to!string(line), and again returns the ref value. So the 
third parameter is a delegate/function that creates a new key from the 
lookup key given as the first parameter. There is a contract in this 
creator function that the created key must equal the lookup key. Thus 
there may be some minor optimization versus the second snippet of the 
OP, since getOrPut doesn't need to calculate hashCode twice.

getOrPut might implicitly know how to convert certain objects, like 
const(char) to string, so
	map.getOrPut(line, 0, to!string) ++;
could be just:
	map.getOrPut(line, 0) ++;


There should also be another signature of getOrPut in which the second 
parameter is delegate/function (a value creator), for situations where 
the entry value is not unexpensive to create. For example:

	map.getOrPut(line, { return new Integer(0); }, to!string) ++;


-- 
Bruno Medeiros - Software Engineer


More information about the Digitalmars-d mailing list