Are AAs broken? (Was: Why isn't bool allowed as AA key type?)

Steven Schveighoffer schveiguy at yahoo.com
Thu Jun 2 04:15:25 PDT 2011


On Thu, 02 Jun 2011 06:44:33 -0400, Timon Gehr <timon.gehr at gmx.ch> wrote:

>> On Thu, 02 Jun 2011 06:21:28 -0400, Timon Gehr <timon.gehr at gmx.ch>  
>> wrote:
>>
>>> Steven Schveighoffer wrote:
>>>> There is always dcollections (has both Hash- and TreeSet).  It is  
>>>> boost
>>>> 1.0, so feel free to steal anything to propose for std.container (in
>>>> fact,
>>>> RedBlackTree is from dcollections).  Note that I'm nowhere near an
>>>> expert
>>>> on hashing, so I'm not sure how it will perform against AAs.  I know  
>>>> it
>>>> does beat them using my custom allocator, but that's not a truly fair
>>>> comparison -- AA's could be written with a custom allocator too.
>>>>
>>>> -Steve
>>>
>>> Well, how? Since an 'in' expression returns an internal pointer into  
>>> the
>>> AA for
>>> value types, they cannot do better than to rely on GC.
>>> This rules out heavy AA use for real-time applications. I think AAs are
>>> broken.
>>
>> What is the issue here?  Are you saying you can use the pointer after
>> destroying the element, because you can't.  AA's free an element when it
>> is removed.
>
> Huh?
>
> import std.stdio,core.memory;
>
> int* test(){
>     int[int] aa=[0:0,1:1,2:2,3:3,4:4,5:5];
>     return 1 in aa;
> }
>
> void main(){
>     int* u=test();
>     GC.collect();
>     assert(*u==1);
>     /*use u for fun and profit*/
> }

That is not a removal:

int[int] aa=[0:0,1:1,2:2,3:3,4:4,5:5];
auto ret = 1 in aa;
aa.remove(1);
return ret; // ret now points to free'd memory

Your equivalent code using dcollections' custom allocator would work too,  
the GC backs up the custom allocator.

>>
>> BTW, dcollections' custom allocator does use the GC, it just uses it in
>> bulk instead of allocating for each node.
>>
>> -Steve
>
> If AAs would do that, the whole bulk could be pinned by one pointer to  
> one element
> inside it, which is still not optimal.

No, but this is a compromise -- better performance vs. smaller footprint.   
It would be nice if the GC just performed better.

> Returning internal pointers into a data
> structure is always a bad idea (assuming it must be safe).

Then ranges on a container are not possible (a range contains a pointer to  
the next element).  Ranges can be invalidated using a "modification"  
count, but this removes some valid uses, and it also costs quite a bit  
(extra int to store range's copy of count, extra check on every range  
function to ensure counts are synchronized, possible extra pointer to  
store in range to point at owner container).

-Steve


More information about the Digitalmars-d mailing list