Principled method of lookup-or-insert in associative arrays?

Michel Fortin michel.fortin at michelf.com
Sat Nov 20 20:26:15 PST 2010


On 2010-11-20 18:04:29 -0500, "Tyro[a.c.edwards]" <nospam at home.com> said:

> On 11/20/2010 9:39 PM, Michel Fortin wrote:
>> But didn't you agree with me yesterday in another thread that it's best
>> to make the caller responsible for the idup in cases where you need a
>> string to be immutable? Now you want the reverse for AAs?
>> 
>> Consider this example:
>> 
>> uint[string][100] maps;
>> foreach (line; stdin.byLine()) {
>> foreach (map; maps) {
>> ++map[line];
>> }
>> }
>> 
>> With your proposal, you'd be silently iduping 100 times each line to
>> produce 100 times the same immutable string.
> 
> Maybe I'm not knowledgeable enough to understand but, if I need to do 
> this under the current constraints I would have to write 
> ++map[line.idup]. In effect I end up duping 100 times anyway. Or is 
> there another solution that I'm just not seeing?

One solution is to just idup the line before entering the inner loop.

	uint[string][100] maps;
	foreach (line; stdin.byLine()) {
		string iline = line.idup;
		foreach (map; maps) {
			++map[iline];
		}
	}

The best solution avoid unnecessary allocations in the problem above to 
would be to idup the line only once for the first map that doesn't 
already contain the key. This way if all maps already have the key you 
don't do any idup, but you are still sure that you won't idup the same 
line twice.

	uint[string][100] maps;
	foreach (line; stdin.byLine()) {
		string iline = null;
		foreach (map; maps) {
			uint* value = line in map;
			if (value)
				++(*value);
			else {
				if (iline == null)
					iline = line.idup;
				++map[iline];
			}
		}
	}

The key point is that by making the caller responsible of calling idup, 
the inefficiency appears more clearly and can be acted upon more 
easily, if deemed necessary.

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



More information about the Digitalmars-d mailing list