associative arrays: iteration is finally here

Wed Oct 28 12:34:44 PDT 2009

On Wed, 28 Oct 2009 22:24:46 +0300, Robert Jacques <sandford at jhu.edu>  
wrote:

> On Wed, 28 Oct 2009 15:06:34 -0400, Denis Koroskin <2korden at gmail.com>  
> wrote:
>
>> On Wed, 28 Oct 2009 17:22:00 +0300, Andrei Alexandrescu  
>> <SeeWebsiteForEmail at erdani.org> wrote:
>>
>>> Walter has magically converted his work on T[new] into work on making  
>>> associative arrays true templates defined in druntime and not  
>>> considered very special by the compiler.
>>>
>>
>> Wow, this is outstanding! (I hope it didn't have any negative impact on  
>> compile-time AA capabilities).
>>
>>> This is very exciting because it opens up or simplifies a number of  
>>> possibilities. One is that of implementing true iteration. I actually  
>>> managed to implement last night something that allows you to do:
>>>
>>> int[int] aa = [ 1:1 ];
>>> auto iter = aa.each;
>>> writeln(iter.front.key);
>>> writeln(iter.front.value);
>>>
>>> Two other iterations are possible: by key and by value (in those cases  
>>> iter.front just returns a key or a value).
>>>
>>> One question is, what names should these bear? I am thinking of makign  
>>> opSlice() a universal method of getting the "all" iterator, a default  
>>> that every container must implement.
>>>
>>> For AAs, there would be a "iterate keys" and "iterate values"  
>>> properties or functions. How should they be called?
>>>
>>>
>>> Thanks,
>>>
>>> Andrei
>>
>> If AA is providing a way to iterate over both keys and values (and it's  
>> a default iteration scheme), why should AA provide 2 other iteration  
>> schemes? Can't they be implemented externally (using adaptor ranges)  
>> with the same efficiency?
>>
>> foreach (e; keys(aa)) {
>>      writefln("key: %s", e);
>> }
>>
>> foreach (e; values(aa)) {
>>      writefln("value: %s", e);
>> }
>>
>> I'd also like you to add a few things in an AA interface.
>>
>> First, opIn should not return a pointer to Value, but a pointer to a  
>> pair of Key and Value, if possible (i.e. if this change won't sacrifice  
>> performance).
>> Second, AA.remove method should accept result of opIn operation to  
>> avoid an additional lookup for removal:
>>
>> if (auto value = key in aa) {
>>      aa.remove(key); // an unnecessary lookup
>> }
>>
>> Something like this would be perfect:
>>
>> struct Element(K,V)
>> {
>>      const K key;
>>      V value;
>> }
>>
>> struct AA(K,V)
>> {
>>      //...
>>      ref Element opIn(K key) { /* throws an exception if element is not  
>> found */ }
>
> Not finding an element is a common use case, not an exception. Using  
> exceptions to pass information is bad style, slow and prevents the use  
> of AAs in pure/nothrow functions. Returning a pointer to an element  
> would allow both key and value to be accessed and could be null if no  
> element is found.
>

Ooops, right, I first wrote it to return a pointer but changed to a  
reference in last moment (mixed it up with opIndex for some reason).
AA.remove should accept a pointer, too.

>>      void remove(ref Element elem) { /* removes an element from an AA  
>> */ }
>>      void remove(K key) { remove(key in this); }
>>
>>      AARange!(K,V) opSlice() { /* iterates over both keys and values */  
>> }
>> }
>>
>> Last, I believe foreach loop should automatically call opSlice() on  
>> iteratee. There is currently an inconsistency with built-in types - you  
>> don't have to call [] on them, yet you must call it on all the other  
>> types:
>>
>> // fine if array is T[] or K[V]
>> foreach (i; array) { ... }
>>
>> // opSlice() is explicit and mandatory for user-defined containers  
>> because they are not ranges.
>> foreach (i; container[]) { ... }
>>
>> Thanks!