F*cked by memory corruption after assigning value to associative array

tsbockman thomas.bockman at gmail.com
Thu Jan 28 22:11:40 UTC 2021


On Thursday, 28 January 2021 at 20:17:09 UTC, frame wrote:
> On Thursday, 28 January 2021 at 19:22:16 UTC, tsbockman wrote:
>> It is possible to get things sort of working on Windows, 
>> anyway.
>
> I'm ok with it as long as the memory is not re-used by the GC. 
> It seems that it can be prevented with addRoot() successfully.

GC.addRoot is not enough by itself. Each GC needs to know about 
every single thread that may own or mutate any pointer to memory 
managed by that particular GC.
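For reference, here is a minimal sketch of what addRoot does and 
does not cover. The Payload type and the pin/unpin helpers are 
hypothetical names made up for illustration; only the core.memory 
calls are real. It keeps one block alive while the only reference 
to it sits in memory the GC does not scan. It does nothing to make 
the GC aware of other threads, which is the problem described 
below.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

struct Payload { int value; }

// Hypothetical helper: register p as a root so its block survives
// collections even when no scanned memory refers to it.
void* pin(void* p) nothrow
{
    GC.addRoot(p);
    GC.setAttr(p, GC.BlkAttr.NO_MOVE); // keep the address stable
    return p;
}

// Hypothetical helper: undo pin() once the outside reference is gone.
void unpin(void* p) nothrow
{
    GC.removeRoot(p);
    GC.clrAttr(p, GC.BlkAttr.NO_MOVE);
}

void main()
{
    auto p = new Payload(42);
    pin(p);

    // Store the only reference in C heap memory, which the GC
    // does not scan.
    auto slot = cast(Payload**) malloc((Payload*).sizeof);
    assert(slot !is null);
    *slot = p;
    p = null;

    GC.collect();
    assert((*slot).value == 42); // still alive because it is a root

    unpin(*slot);
    free(slot);
}
```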

If a GC doesn't know about such a thread, memory may be freed 
prematurely and then wrongly re-used. When the GC scans memory 
for pointers to find out which memory is still in use, an 
untracked thread may be hiding a pointer on its stack or in 
registers. Or, the thread might move a pointer value from a 
location late in the scanning order to a location early in the 
scanning order while the GC is scanning the memory in between, 
such that the pointer value is not in either location *at the 
time the GC checks it*.

You won't be able to test for this problem easily, because it is 
non-deterministic and depends upon the precise timing with which 
each thread is scheduled and memory is synchronized. But, it will 
probably still bite you later.

If you were just manually creating additional threads unknown to 
the GC, you could tell the GC about them with 
core.thread.osthread.thread_attachThis and thread_detachThis. 
However, I don't know whether those work correctly when multiple 
disconnected copies of the D runtime are running at the same 
time, as they are here.
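For the simple case (one D runtime, one extra thread it did not 
create), attaching looks roughly like the sketch below. The 
foreignEntry function is a hypothetical entry point, and the 
Posix-only main just stands in for whatever foreign code would 
really create the thread. It is an assumption that this carries 
over to the multiple-runtime DLL setup; I have not verified that.

```d
import core.thread : thread_attachThis, thread_detachThis;

// Hypothetical entry point for a thread created outside druntime.
// arg is assumed to point at an int that receives the result.
extern (C) void* foreignEntry(void* arg) nothrow
{
    try
    {
        // Register this thread so the GC scans its stack and
        // registers, and can pause it during a collection.
        thread_attachThis();
        scope (exit) thread_detachThis();

        // GC allocations are safe to hold on this stack now.
        auto scratch = new int[](1024);
        scratch[] = 1;

        int sum = 0;
        foreach (v; scratch)
            sum += v;
        *cast(int*) arg = sum;
    }
    catch (Exception e)
    {
        // Never let an exception escape into foreign code.
    }
    return null;
}

version (Posix)
{
    import core.sys.posix.pthread : pthread_create, pthread_join,
        pthread_t;

    void main()
    {
        int result;
        pthread_t tid;
        // Create the thread behind druntime's back.
        auto rc = pthread_create(&tid, null, &foreignEntry, &result);
        assert(rc == 0);
        pthread_join(tid, null);
        assert(result == 1024);
    }
}
else
{
    // Thread creation on other platforms is left out of this sketch.
    void main() {}
}
```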

The official solution is to get the GC proxy connected properly 
from each DLL to the EXE. This is still very broken on Windows in 
other ways (again, explained at my link), but it should at least 
prevent the race condition I described above, as well as being 
more efficient than running multiple GCs in parallel.

Alternatively, you can design your APIs so that no pointer to GC 
memory is ever owned or mutated by any thread unknown to that GC. 
(This is the only option when working across language boundaries.)
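One way to do that, sketched under assumptions: copy the data out 
of GC memory into a malloc'd buffer before it crosses the 
boundary, so the other side never holds a GC pointer at all. 
exportCopy and releaseCopy are hypothetical names, not part of any 
library, and the caller is assumed to hand the buffer back for 
release.

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy, strlen;

// Hypothetical helper: copy a D string into a NUL-terminated buffer
// on the C heap. Returns null if the allocation fails.
char* exportCopy(const(char)[] s) nothrow @nogc
{
    auto buf = cast(char*) malloc(s.length + 1);
    if (buf is null)
        return null;
    if (s.length)
        memcpy(buf, s.ptr, s.length);
    buf[s.length] = '\0';
    return buf;
}

// Hypothetical helper: the receiver returns the buffer here when
// it is done with it.
void releaseCopy(char* p) nothrow @nogc
{
    free(p);
}

void main()
{
    string gcOwned = "hello".idup; // lives in GC memory

    // Only the malloc'd copy crosses the boundary; no GC is
    // involved in its lifetime.
    char* c = exportCopy(gcOwned);
    assert(c !is null);
    scope (exit) releaseCopy(c);

    assert(strlen(c) == 5);
}
```

The cost is an extra copy and manual lifetime management, but the 
lifetime of the buffer no longer depends on which threads any GC 
knows about.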


More information about the Digitalmars-d-learn mailing list