Possible bug in associative array implementation (and/or @safe checking)

Aaron D. Trout atrout at chatham.edu
Thu Aug 16 20:45:35 UTC 2018


On Thursday, 16 August 2018 at 18:56:45 UTC, Steven Schveighoffer 
wrote:
> On 8/16/18 2:32 PM, Aaron D. Trout wrote:
>> [...]
>> On Thursday, 16 August 2018 at 17:20:23 UTC, Steven 
>> Schveighoffer wrote:
>>>
>>> Yes, this is the effect I would expect.
>>>
>>> D has traditionally simply allowed slicing stack data without 
>>> question (even in @safe code), but that will change when 
>>> dip1000 is fully realized. It will be allowed, but only when 
>>> assigning to scope variables.
>>>
>> 
>> Thanks for the quick and knowledgeable reply! I think I 
>> understand what's going on, but I'm surprised it is allowed in 
>> @safe code since the compiler doesn't allow the following, 
>> even in non- at safe code:
>> 
>> int[] badSlice()
>> {
>>      int[2] buffer;
>>      return buffer[];
>> }
>
> It's because it's on the same line. This is a crude "safe" 
> feature that is easily duped.
>
> This is allowed to compile:
>
> int[2] buffer;
> auto buf = buffer[];
> return buf;
>
> But add -dip1000 to the dmd options and that fails.
>
> I would warn you that I think dip1000 is too crude to start 
> trying to apply it to your project, and may have linker errors 
> with Phobos.
>
>> I guess the compiler just isn't (yet!) able to catch that the 
>> associative array is storing a slice of expired stack. I'm 
>> surprised that the built-in AA implementation *allows* using 
>> slices as keys in @safe code without copying the underlying 
>> data to the heap first. This is clearly dangerous, but perhaps 
>> heap-copying slices defensively would result in an 
>> unacceptable performance hit.
>
> I wouldn't put too much stock in having safety in the AA. The 
> AA is a very very old piece of the compiler, that pre-dates 
> safety checks, and still is a bit of a kludge in terms of type 
> and memory safety. If you do find any obvious bugs, it's good 
> to report them.
>
>> This issue came up while trying to eliminate unnecessary 
>> allocation in my code. In my case, I could set a maximum key 
>> length at compile time and switch my key type to a struct 
>> wrapping a static array buffer.
>> 
>> In hindsight, it was silly for me to think I could eliminate 
>> separately allocating the keys when the key type was a 
>> variable length array, since the AA must store the keys. That 
>> said, a suitable admonition from the compiler here would have 
>> been very educational. I look forward to seeing the full 
>> inclusion of DIP1000!
>
> In this case, actually, the AA does NOT store the key data, but 
> just the reference to the keys. An array slice is a pointer and 
> length, and the data is stored elsewhere. The static version, 
> however, does store all the key data inside the AA.
>
> That being said, you can potentially avoid more allocation with 
> the keys with various tricks, such as pre-allocating all the 
> keys and then using the reference.
>
> In other words, eagerly stick the data into an array of arrays:
>
>     auto sets = setA.map!(j => setB.filter!(i => i % j == 
> 0).array).array;
>
> and then not worry about duping them. But it all depends on 
> your use case.
>
> -Steve

Thanks again for the quick reply! I have a pretty firm grasp on 
what a slice is (array + offset). What I had meant by the comment 
"the AA must store the keys" was that I had somehow gotten the 
(of course totally mistaken!) idea that the AA only ever needed 
to *examine* the key rather than actually storing it. If that 
were the case, a slice of about-to-be-expired stack would be 
perfectly fair game as a key. Am I correct that doing this 
*would* be an OK way to avoid unnecessary allocation if we knew 
the key already existed (as a heap allocated slice) in the AA and 
we simply wanted to modify the associated value? Example code:

--------------------------------------------------------------

immutable(int)[len] toImmutStaticArray(size_t len, R)(R range)
{
     import std.algorithm : copy;
     int[len] r;
     copy(range, r[]);
     return r;
}

void main() @safe
{
     int[int[]] aa;
     immutable(int)[] heapSlice = [0,1];
     aa[heapSlice] = 0;  // OK, aa stores heap allocated key

     {
         import std.range : iota;
         auto buffer = 2.iota.toImmutStaticArray!2;
         auto stackSlice = buffer[];
         aa[stackSlice] = 1; // OK yes? only accessing value
     }

     assert(aa[heapSlice] == 1);
}

--------------------------------------------------------------

Thanks also for the advice about -dip1000 and the state of the 
built-in AA implementation. My code base has been changing to 
include more AA-heavy data structures, so I think that in the 
near future I will need to do some refactoring to make changing 
AA implementation easier.

Also, one last question: should this issue be reported as a new 
bug? My understanding was that @safe code should not allow 
obtaining references to expired stack memory, but perhaps this is 
already a known problem? I'm happy to file a new bug report if 
that would be helpful!

- Aaron Trout


More information about the Digitalmars-d-learn mailing list