idiomatic D: what to use instead of pointers in constructing a tree data structure?

H. S. Teoh via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Wed Jan 14 11:34:27 PST 2015


On Wed, Jan 14, 2015 at 07:05:16PM +0000, Laeeth Isharc via Digitalmars-d-learn wrote:
> On Tuesday, 13 January 2015 at 17:41:53 UTC, Tobias Pankrath wrote:
> >On Tuesday, 13 January 2015 at 17:19:42 UTC, Laeeth Isharc wrote:
> >>
> >>The GC is allowed to move structs around, as I undestand it.  Under
> >>what circumstances do I get into trouble having a pointer to them?
> >
> >None, a GC that moves structs around must update every pointer
> >afterwards and as far as I know, the standard GC doesn't do that
> >(moving things around).
> >
> >You may not have a pointer inside a struct that points to the struct
> >itself. This allows the compiler to elide some copies.
> 
> Got it - thanks.

Having a pointer inside a struct that points to itself can also cause
unexpected and sometimes very hard to find bugs. Example from real code
where this bug occurred:

	struct S {
		// Resources we wish to cleanup
		Resource1 r1;
		Resource2 r2;
		...

		// List of cleanup routines we'd like to run when the
		// struct is destroyed.
		void delegate()[] cleanups;

		this(...) {
			// Acquire a resource
			r1 = acquireResource1();

			// Clean it up once we're done.
			// N.B.: the delegate implicitly closes over
			// 'this' -- this is what gets us in trouble
			// later.
			cleanups ~= { freeResource1(this.r1); };

			r2 = acquireResource2();
			cleanups ~= { freeResource2(this.r2); };

			...
		}

		~this() {
			// Cleanup resources in reverse order we
			// acquired them.
			foreach_reverse(dg; cleanups) {
				dg();
			}
		}
	}

	/* This function works without any problems. */
	void func1() {
		auto s = S(...); // create an instance of S
		doStuff();
		// Here, S gets cleaned up, and S's dtor releases the
		// resources correctly. No problem.
	}

	/* The following pair of functions, however, have problems... */
	S makeS() {
		// This is just a convenience function for creating an
		// instance of S. Useful for factoring out any messy
		// implementation details that may be needed from the
		// body of func2().
		return S(...);
	}
	void func2() {
		// Make an instance of S, as before.
		auto s = makeS();

		// Do some stuff
		doStuff();

		// And now we are done, so s's dtor should cleanup.
		// ... except, it crashes instead. Why??
	}

func1() runs perfectly normally, no problem. However, func2() crashes
upon exit. Why?

The reason is that the delegates created by S.this() close over the
'this' pointer to the new instance of S. But since S is a struct, this
pointer is actually a pointer to a local variable on the stack of the
caller. This is not a problem in func1() because this local variable
resides in the same scope as the body of func1(), so the cleanup
delegates get invoked before the local variable goes out of scope.
However, in makeS(), we are returning a newly-created instance of S. The
compiler actually implements this by creating a temporary local variable
to hold the new instance of S, which means the 'this' pointer closed
over by the delegates points to this temporary local variable, and then
when the new instance of S is returned to func2(), since S is a by-value
type, it gets bitwise copied into *another* local variable, that is, 's'
in func2(). So the instance of S that func2() sees is actually a
different copy of S than the one created by makeS(). The copy created by
makeS() has now gone out of scope; however, the delegates registered in
s.cleanups have not been adjusted, so they are still closing over the
address of the original copy of S in makeS().

Now when func2() finishes and prepares to return, s's dtor gets invoked,
and it calls the cleanup delegates which attempt to access the instance
of S at the original address, which is now invalid. The result is that
they see garbage data instead of the real values of .r1 and .r2, and it
crashes the program. (If you're lucky, that is. If you're not so lucky,
it won't crash, but will corrupt memory and/or get into an inconsistent
state which causes malfunctions later on. Good luck tracing down the
problem then!)

Moral of the story: don't have struct fields that point to the struct
itself. This is almost always a bad idea. Structs have value semantics,
and the implicit copying around will almost certainly break any
self-referencing pointers, which leads to dangling pointers, good
friends of memory corruption, et al. :-P


T

-- 
Trying to define yourself is like trying to bite your own teeth. -- Alan Watts


More information about the Digitalmars-d-learn mailing list