Tricky DMD bug, but I have no idea how to report

H. S. Teoh hsteoh at quickfur.ath.cx
Tue Dec 18 22:56:19 UTC 2018


On Tue, Dec 18, 2018 at 10:29:07PM +0000, JN via Digitalmars-d-learn wrote:
> On Monday, 17 December 2018 at 22:22:05 UTC, H. S. Teoh wrote:
> > A less likely possibility might be an optimizer bug -- do you get
> > different results if you add / remove '-O' (and/or '-inline') from
> > your dmd command line?  If adding or removing some combination of
> > -O and -inline "fixes" the problem, it could be an optimizer
> > bug. But those are rare, and usually only show up when you use an
> > obscure D feature combined with another obscure corner case, in a
> > way that people haven't thought of.  My bet is still on a pointer
> > bug somewhere in your code.
> > 
> 
> I played around with the dmd command line. It works with -O. It works
> with -O -inline. As soon as I add -boundscheck=off, it breaks.
> 
> As I understand it, out-of-bounds access is UB, which would fit my
> problems because they look like UB. But if I run without
> -boundscheck=off, shouldn't I get a RangeError somewhere?

In theory, yes.  But I wonder if there's some corner case where some
combination of -O and -inline causes a bounds check to be elided while
the out-of-bounds access itself still happens, which would be UB.
Perhaps the optimizer skipped a bounds check even though it shouldn't
have.  What about compiling with -boundscheck=off but without -O
-inline?  Does that make a difference?
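
For reference, here is a minimal sketch of what the bounds check normally
does. The function name tripsBoundsCheck is hypothetical and only for
illustration. With bounds checking enabled, the second call in main() is
expected to report true, because the out-of-range index raises a
RangeError. With -boundscheck=off the check is compiled out and the same
access is undefined behaviour, so nothing can be said about the result.
Catching an Error like this is only reasonable in a small diagnostic
program, not in production code.

```d
import core.exception : RangeError;
import std.stdio : writeln;

// Hypothetical helper: returns true if writing to data[index] trips the
// runtime bounds check (i.e. a RangeError is thrown).
bool tripsBoundsCheck(int[] data, size_t index)
{
    try
    {
        data[index] = 0;   // bounds-checked unless checks are disabled
        return false;
    }
    catch (RangeError e)
    {
        return true;
    }
}

void main()
{
    auto data = new int[](4);
    writeln(tripsBoundsCheck(data, 2)); // index inside the array
    writeln(tripsBoundsCheck(data, 4)); // index one past the end
}
```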

Barring that, it might be one of those really evil pointer bugs where
the problem has already happened far away from the site where the
symptoms first appear, usually an undetected memory corruption that only
shows up as invalid data long after the actual corruption happened. Very
hard to trace.

Are you sure you didn't accidentally do something like escape a pointer
to a local variable, or a slice of a local static array that has since
gone out of scope?  Because that's what your symptoms most closely
resemble.  The last time I ran into this in my own D code, it was caused
by D's really evil implicit conversion of static arrays to slices, where
passing a local static array implicitly passes a slice instead, e.g.:

	SomeObject persistentStorage;

	auto someFunc(int[] data)
	{
		... // stuff
		persistentStorage.insert(data); // retains reference to data
		...
	}

	void buggyCode()
	{
		int[16] arr = ...;
		...
		someFunc(arr);	// <--- implicit conversion happens here
		...
		// uh oh, arr is going out of scope, but
		// persistentStorage holds a reference to it
	}

	void main()
	{
		...
		buggyCode(); // escaped reference to local variable
		...

		// Crash when it tries to access the slice to
		// out-of-scope data:
		doSomething(persistentStorage);
		...
	}

Since no explicit slicing was done, there was no compiler error /
warning of any sort, and it wasn't obvious from the code what had
happened. By the time doSomething() was called, it was already long past
the source of the problem in buggyCode(), and it was almost impossible
to trace the problem back to its source.
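
One possible way to avoid this kind of escape, sketched here under my own
assumptions, is to copy the data onto the GC heap with .dup before
retaining it. The names storage, retainCopy and caller are hypothetical
stand-ins for persistentStorage, someFunc and buggyCode above. This is
only one option; it trades an allocation and a copy for safety.

```d
import std.stdio : writeln;

// Hypothetical stand-in for persistentStorage: outlives every caller.
int[][] storage;

// Retains a GC-allocated copy of the data, never the caller's memory.
void retainCopy(int[] data)
{
    storage ~= data.dup;
}

void caller()
{
    int[4] arr = [1, 2, 3, 4];
    // The static array is still implicitly converted to a slice here,
    // but only the heap copy made by .dup is kept after we return.
    retainCopy(arr);
}

void main()
{
    caller();
    // Safe: storage[0] refers to heap memory, not to caller()'s stack.
    writeln(storage[0]);
}
```

If the copy is too expensive, the alternative is to make sure the array
itself lives long enough, e.g. by allocating it with new in the caller.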

Theoretically, -dip25 and -dip1000 are supposed to prevent this sort of
problem, but I don't know how fully implemented they are, whether they
would catch the specific instance in your code, or whether your code
even compiles with these options.


T

-- 
There's light at the end of the tunnel. It's the oncoming train.
