Either I'm confused or the gc is
H. S. Teoh
hsteoh at quickfur.ath.cx
Wed Oct 21 18:55:50 UTC 2020
On Wed, Oct 21, 2020 at 04:24:41PM +0000, donallen via Digitalmars-d wrote:
> I'm new to D, but not to programming (I wrote my first line of code 60
> years ago and am retired from a long career as a developer and project
> manager).
Welcome aboard! ;-)
[...]
> ````
> // Now do the children of this account
> // First determine how many there are
> bind_text(count_children, 1, account.guid);
> int n_children = one_row(count_children, &get_int).integer_value;
> if (n_children > 0)
> {
>     Account[] children = new Account[n_children];
>     bind_text(find_children, 1, account.guid);
>     int i = 0;
>     while (next_row_available_p(find_children, &reset_stmt))
>     {
>         if (i > n_children-1)
>             panic("walk_account_tree: child index exceeds n_children");
>         children[i].name =
>             fromStringz(sqlite3_column_text(find_children, 0)).dup;
>         children[i].path = format("%s:%s", account.path, children[i].name);
>         children[i].guid =
>             fromStringz(sqlite3_column_text(find_children, 1)).dup; // guid
>         children[i].commodity_guid =
>             fromStringz(sqlite3_column_text(find_children, 2)).dup; // commodity_guid
>         children[i].flags = sqlite3_column_int(find_children, 3) | account.flags &
>             (account_flag_descendents_are_assets
>             | account_flag_descendents_are_liabilities
>             | account_flag_descendents_are_income
>             | account_flag_descendents_are_expenses
>             | account_flag_descendents_are_marketable
>             | account_flag_self_and_descendents_are_tax_related
>             | account_flag_descendents_need_commodity_link); // flags
>         i = i + 1;
>     }
>     foreach (k, ref child; children)
>         walk_account_tree(&child, ancestor_flags | account.flags);
> }
> ````
>
> The problem is that at some point, the verifier starts spewing bogus
> error messages about what it is seeing in the tree. Oddly, putting in
> debugging writelns results in the error messages not occurring -- a
> Heisenbug. But working with gdb, I found that the account structs
> after the error messages start are zeroed. Turning on gc profiling
> tells me that 3 gcs have occurred. Disabling the gc results in the
> program running correctly -- no error messages (and I know the
> database has no errors because the C version, which has been around
> for awhile, confirms that the db is well formed).
Without seeing the rest of your code, or a minimal (runnable) failing
test case, it's hard to say for certain what the problem is. But if I
were to guess, it looks like the GC is prematurely collecting your
arrays, though I can't imagine why it would if you still retain
references to them.
One potential gotcha, since you mention that this is being ported from
C, is that if you pass pointers to GC-allocated objects to C code which
stores them somewhere, you need to make sure you retain a reference
somewhere on the D side of things, or else inform the GC of the
reference using core.memory.GC.addRoot.[1] Otherwise, since the GC may
not be aware of where the C code stores the pointer, it may fail to find
it, wrongly conclude that the object is dead, and collect it
prematurely.
[1] See: https://dlang.org/library/core/memory/gc.add_root.html
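To make the addRoot suggestion concrete, here is a minimal sketch. The
Account struct, cSideSlot, handToC and takeBackFromC are hypothetical
stand-ins I made up for illustration (a malloc'd slot plays the role of
the C library's private storage); only GC.addRoot, GC.removeRoot and
GC.collect are real druntime calls. Your actual code will look
different.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

// Hypothetical stand-in for the poster's Account struct.
struct Account
{
    string name;
    int flags;
}

// Hypothetical stand-in for storage owned by a C library. It lives in
// malloc'd memory, which the GC does not scan for pointers.
Account** cSideSlot;

void handToC(Account* acct)
{
    GC.addRoot(acct);    // register the object before C becomes its only holder
    *cSideSlot = acct;
}

Account* takeBackFromC()
{
    Account* acct = *cSideSlot;
    *cSideSlot = null;
    GC.removeRoot(acct); // C no longer holds it; normal GC rules apply again
    return acct;
}

void main()
{
    cSideSlot = cast(Account**) malloc((Account*).sizeof);
    scope (exit) free(cSideSlot);

    handToC(new Account("Assets", 1));
    GC.collect();        // the registered root keeps the object alive
    Account* acct = takeBackFromC();
    assert(acct.name == "Assets" && acct.flags == 1);
}
```

The point of pairing addRoot with removeRoot is that a root that is
never removed keeps its object alive for the rest of the program.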
Another potential problem is C code (or C-style code ported to D) that
obscures pointers, e.g., by using the XOR trick to implement a
doubly-linked list with only a single pointer field. The GC will not be
able to discover such a reference, and thus may wrongly treat the
referenced object as dead.
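For anyone unfamiliar with the XOR trick, this is roughly what it looks
like. Node, obscure and recover are hypothetical names of my own, not
anything from your code. The sketch only shows that the stored field
matches neither neighbour's address, which is why a GC scanning that
field cannot recognise it as a reference.

```d
// Hypothetical XOR-linked list node: one integer field encodes both
// neighbours instead of two real pointers.
struct Node
{
    int value;
    size_t link; // address of prev XOR address of next
}

// Combine two neighbour addresses into a single obscured value.
size_t obscure(const Node* prev, const Node* next)
{
    return cast(size_t) prev ^ cast(size_t) next;
}

// Given one neighbour, recover the other from the obscured value.
Node* recover(size_t link, const Node* known)
{
    return cast(Node*)(link ^ cast(size_t) known);
}

void main()
{
    // The locals a, b and c keep all three nodes alive in this demo.
    Node* a = new Node(1, 0);
    Node* b = new Node(2, 0);
    Node* c = new Node(3, 0);
    b.link = obscure(a, c);

    // Traversal still works in both directions...
    assert(recover(b.link, a) is c);
    assert(recover(b.link, c) is a);

    // ...but the stored value is not the address of either neighbour.
    assert(b.link != cast(size_t) a);
    assert(b.link != cast(size_t) c);
}
```

If b.link were the only thing referring to a and c, the GC would have
no way of knowing they are still in use.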
Another thing that stuck out to me while glancing over your code
snippet is that the `children` array doesn't seem to be stored
anywhere. The local variable goes away at the end of the enclosing
block, and after that the array may be collected unless you retain at
least one reference to an array element somewhere. I can't tell if this
is important without knowing the rest of your code. (It also puzzles me
why this would be a problem, since obviously your code still has a
reference to *something* in that array in order for the verifier code to
check it. But it's something to look into if there are no other clues.)
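One way to rule this out is to have each account own its children, so
that the array stays reachable for as long as the parent is. A minimal
sketch, assuming an Account struct with a `children` field (my
invention; your struct may well be laid out differently) and a
hypothetical loadChildren helper in place of your SQLite loop:

```d
import core.memory : GC;

// Hypothetical layout: the parent keeps a slice of its children, so the
// GC can reach the array through the parent.
struct Account
{
    string name;
    Account[] children;
}

// Hypothetical helper standing in for the database loop in the snippet.
void loadChildren(ref Account parent, string[] names)
{
    parent.children = new Account[names.length];
    foreach (i, name; names)
        parent.children[i].name = name;
}

void main()
{
    Account root = Account("Root");
    loadChildren(root, ["Assets", "Liabilities"]);

    GC.collect(); // the array is reachable via root.children

    assert(root.children.length == 2);
    assert(root.children[1].name == "Liabilities");
}
```

If the problem persists with the array anchored like this, then the
local `children` variable is unlikely to be the culprit.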
> I could post a PR, but I'm not sure that this is a bug. It could
> easily be a misunderstanding by me of D and its memory management. So
> I thought I'd try this post first, hoping that one of you who knows
> the language better than I do could point out a problem with my code.
> I do need to resolve this or abandon this project.
[...]
Have you identified what's causing the problem? I.e., what would you put
in your PR?
T
--
He who does not appreciate the beauty of language is not worthy to bemoan its flaws.