Possible bug in associative array implementation (and/or @safe checking)
Aaron D. Trout
atrout at chatham.edu
Thu Aug 16 20:45:35 UTC 2018
On Thursday, 16 August 2018 at 18:56:45 UTC, Steven Schveighoffer
wrote:
> On 8/16/18 2:32 PM, Aaron D. Trout wrote:
>> [...]
>> On Thursday, 16 August 2018 at 17:20:23 UTC, Steven
>> Schveighoffer wrote:
>>>
>>> Yes, this is the effect I would expect.
>>>
>>> D has traditionally simply allowed slicing stack data without
>>> question (even in @safe code), but that will change when
>>> dip1000 is fully realized. It will be allowed, but only when
>>> assigning to scope variables.
>>>
>>
>> Thanks for the quick and knowledgeable reply! I think I
>> understand what's going on, but I'm surprised it is allowed in
>> @safe code since the compiler doesn't allow the following,
>> even in non- at safe code:
>>
>> int[] badSlice()
>> {
>> int[2] buffer;
>> return buffer[];
>> }
>
> It's because it's on the same line. This is a crude "safe"
> feature that is easily duped.
>
> This is allowed to compile:
>
> int[2] buffer;
> auto buf = buffer[];
> return buf;
>
> But add -dip1000 to the dmd options and that fails.
>
> I would warn you that I think dip1000 is too crude to start
> trying to apply it to your project, and may have linker errors
> with Phobos.
>
>> I guess the compiler just isn't (yet!) able to catch that the
>> associative array is storing a slice of expired stack. I'm
>> surprised that the built-in AA implementation *allows* using
>> slices as keys in @safe code without copying the underlying
>> data to the heap first. This is clearly dangerous, but perhaps
>> heap-copying slices defensively would result in an
>> unacceptable performance hit.
>
> I wouldn't put too much stock in having safety in the AA. The
> AA is a very very old piece of the compiler, that pre-dates
> safety checks, and still is a bit of a kludge in terms of type
> and memory safety. If you do find any obvious bugs, it's good
> to report them.
>
>> This issue came up while trying to eliminate unnecessary
>> allocation in my code. In my case, I could set a maximum key
>> length at compile time and switch my key type to a struct
>> wrapping a static array buffer.
>>
>> In hindsight, it was silly for me to think I could eliminate
>> separately allocating the keys when the key type was a
>> variable length array, since the AA must store the keys. That
>> said, a suitable admonition from the compiler here would have
>> been very educational. I look forward to seeing the full
>> inclusion of DIP1000!
>
> In this case, actually, the AA does NOT store the key data, but
> just the reference to the keys. An array slice is a pointer and
> length, and the data is stored elsewhere. The static version,
> however, does store all the key data inside the AA.
>
> That being said, you can potentially avoid more allocation with
> the keys with various tricks, such as pre-allocating all the
> keys and then using the reference.
>
> In other words, eagerly stick the data into an array of arrays:
>
> auto sets = setA.map!(j => setB.filter!(i => i % j ==
> 0).array).array;
>
> and then not worry about duping them. But it all depends on
> your use case.
>
> -Steve
Thanks again for the quick reply! I have a pretty firm grasp on
what a slice is (array + offset). What I had meant by the comment
"the AA must store the keys" was that I had somehow gotten the
(of course totally mistaken!) idea that the AA only ever needed
to *examine* the key rather than actually storing it. If that
were the case, a slice of about-to-be-expired stack would be
perfectly fair game as a key. Am I correct that doing this
*would* be an OK way to avoid unnecessary allocation if we knew
the key already existed (as a heap allocated slice) in the AA and
we simply wanted to modify the associated value? Example code:
--------------------------------------------------------------
immutable(int)[len] toImmutStaticArray(size_t len, R)(R range)
{
import std.algorithm : copy;
int[len] r;
copy(range, r[]);
return r;
}
void main() @safe
{
int[int[]] aa;
immutable(int)[] heapSlice = [0,1];
aa[heapSlice] = 0; // OK, aa stores heap allocated key
{
import std.range : iota;
auto buffer = 2.iota.toImmutStaticArray!2;
auto stackSlice = buffer[];
aa[stackSlice] = 1; // OK yes? only accessing value
}
assert(aa[heapSlice] == 1);
}
--------------------------------------------------------------
Thanks also for the advice about -dip1000 and the state of the
built-in AA implementation. My code base has been changing to
include more AA-heavy data structures, so I think that in the
near future I will need to do some refactoring to make changing
AA implementation easier.
Also, one last question: should this issue be reported as a new
bug? My understanding was that @safe code should not allow
obtaining references to expired stack memory, but perhaps this is
already a known problem? I'm happy to file a new bug report if
that would be helpful!
- Aaron Trout
More information about the Digitalmars-d-learn
mailing list