Want to help DMD bugfixing? Write a simple utility.

Jonathan M Davis jmdavisProg at gmx.com
Wed Mar 23 08:16:46 PDT 2011


> On Sun, 20 Mar 2011 07:50:10 -0000, Jonathan M Davis <jmdavisProg at gmx.com> wrote:
> >> Jonathan M Davis wrote:
> >> > On Saturday 19 March 2011 18:04:57 Don wrote:
> >> >> Jonathan M Davis wrote:
> >> >>> On Saturday 19 March 2011 17:11:56 Don wrote:
> >> >>>> Here's the task:
> >> >>>> Given a .d source file, strip out all of the unittest {} blocks,
> >> >>>> including everything inside them.
> >> >>>> Strip out all comments as well.
> >> >>>> Print out the resulting file.
> >> >>>> 
> >> >>>> Motivation: Bug reports frequently come with very large test cases.
> >> >>>> Even ones which look small often import from Phobos.
> >> >>>> Reducing the test case is the first step in fixing the bug, and it's
> >> >>>> frequently ~30% of the total time required. Stripping out the unit
> >> >>>> tests is the most time-consuming and error-prone part of reducing the
> >> >>>> test case.
> >> >>>> 
> >> >>>> This should be a good task if you're relatively new to D but would
> >> >>>> like to do something really useful.
> >> >>> 
> >> >>> Unfortunately, to do that 100% correctly, you need to actually have a
> >> >>> working D lexer (and possibly parser). You might be able to get
> >> >>> something close enough to work in most cases, but it doesn't take all
> >> >>> that much to throw off a basic implementation of this sort of thing if
> >> >>> you don't lex/parse it with something which properly understands D.
> >> >>> 
> >> >>> - Jonathan M Davis
> >> >> 
> >> >> I didn't say it needs 100% accuracy. You can assume, for example, that
> >> >> "unittest" always occurs at the start of a line. The only other things
> >> >> you need to lex are {}, string literals, and comments.
> >> >> 
> >> >> BTW, the immediate motivation for this is std.datetime in Phobos. The
> >> >> sheer number of unittests in there is an absolute catastrophe for
> >> >> tracking down bugs. It makes a tool like this MANDATORY.
> >> > 
> >> > I tried to create a similar tool before and gave up because I couldn't
> >> > make it 100% accurate and was running into problems with it. If someone
> >> > wants to take a shot at it though, that's fine.
> >> > 
> >> > As for the unit tests in std.datetime making it hard to track down bugs,
> >> > that only makes sense to me if you're trying to look at the whole thing
> >> > at once and track down a compiler bug which happens _somewhere_ in the
> >> > code, but you don't know where. Other than a problem like that, I don't
> >> > really see how the unit tests get in the way of tracking down bugs. Is
> >> > it that you need to compile in a version of std.datetime which doesn't
> >> > have any unit tests compiled in but you still need to compile with
> >> > -unittest for other stuff?
> >> 
> >> No. All you know is that there's a bug that's being triggered somewhere in
> >> Phobos (with -unittest). It's probably not in std.datetime.
> >> But Phobos is a horrible ball of mud where everything imports everything
> >> else, and std.datetime is near the centre of that ball. What you have to
> >> do is reduce the amount of code, and especially the number of modules,
> >> as rapidly as possible; this means getting rid of imports.
> >> 
> >> To do this, you need to remove large chunks of code from the files. This
> >> is pretty simple; comment out half of the file, if it still works, then
> >> delete it. Normally this works well because typically only about a dozen
> >> lines are actually being used. After doing this about three or four
> >> times it's small enough that you can usually get rid of most of the
> >> imports. Unittests foul this up because they use functions/classes from
> >> inside the file.
> >> 
> >> In the case of std.datetime it's even worse because the signal-to-noise
> >> ratio is so incredibly poor; it's really difficult to find the few lines
> >> of code that are actually being used by other Phobos modules.
> >> 
> >> My experience (obviously only over the last month or so) has been that
> >> if the reduction of a bug is non-obvious, more than 10% of the total
> >> time taken to fix that bug is the time taken to cut down std.datetime.
> > 
> > Hmmm. I really don't know what could be done to fix that (other than
> > making it easier to rip out the unittest blocks). And enough of
> > std.datetime depends on other parts of std.datetime that trimming it down
> > isn't (and can't be) exactly easy. In general, SysTime is the most likely
> > type to be used, and it depends on Date, TimeOfDay, and DateTime, and all
> > 4 of those depend on most of the free functions in the module. It's not
> > exactly designed in a manner which allows you to cut out large chunks and
> > still have it compile. And I don't think that it _could_ be designed that
> > way and still have the functionality that it has.
> > 
> > I guess that this sort of problem is one that would pop up mainly when
> > dealing with compiler bugs. I have a hard time seeing it popping up with
> > your typical bug in Phobos itself. So, I guess that this is the sort of
> > thing that you'd run into and I likely wouldn't.
> > 
> > I really don't know how the situation could be improved though other than
> > making it easier to cut out the unit tests.
> 
> I was just thinking: if we get a list of the symbols the linker is
> including, then write an app to take that list and strip everything else
> out of the source, would that work? The questions are how hard it is to get
> the symbols from the linker, and then how hard it is to match those to
> source. IIRC there are functions in Phobos to convert to/from symbol
> names, so if the app had sufficient lexing and parsing capability it could
> match on those.

That would require a full-blown D lexer and parser.
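
By contrast, the narrower task Don set at the top of the thread (assume
"unittest" always starts a line, and lex only {}, string literals, and
comments) can be approximated without one. What follows is only a rough
sketch, written in Python rather than D purely for illustration. The names
strip_d_source, _skip_comment and _skip_literal are made up for this example,
and it deliberately ignores r"...", q"..." and q{...} literals, so it will get
some real-world files wrong.

```python
import re

def _skip_comment(src, i):
    """Return the index just past a comment starting at src[i], or i if none."""
    if src.startswith("//", i):
        end = src.find("\n", i)
        return len(src) if end == -1 else end      # the newline itself is kept
    if src.startswith("/*", i):
        end = src.find("*/", i + 2)
        return len(src) if end == -1 else end + 2
    if src.startswith("/+", i):                    # D's nesting comments
        depth, j = 1, i + 2
        while j < len(src) and depth:
            if src.startswith("/+", j):
                depth, j = depth + 1, j + 2
            elif src.startswith("+/", j):
                depth, j = depth - 1, j + 2
            else:
                j += 1
        return j
    return i

def _skip_literal(src, i):
    """Return the index just past a string/char literal at src[i], or i if none."""
    quote = src[i]
    if quote not in "\"'`":
        return i
    j = i + 1
    while j < len(src) and src[j] != quote:
        # Backslash escapes apply to "..." and '...', not to `...` strings.
        j += 2 if src[j] == "\\" and quote != "`" else 1
    return min(j + 1, len(src))

# Assumption from the thread: "unittest" only matters at the start of a line.
_UNITTEST = re.compile(r"[ \t]*unittest\b")

def strip_d_source(src):
    """Drop comments and unittest { ... } blocks from D source text."""
    out = []
    i = 0
    depth = None   # None: ordinary code; int: brace depth inside a unittest
    while i < len(src):
        j = _skip_comment(src, i)
        if j == i:
            j = _skip_literal(src, i)
            if j != i and depth is None:
                out.append(src[i:j])               # literals outside unittests stay
        if j != i:
            i = j
            continue
        if depth is None:
            at_line_start = i == 0 or src[i - 1] == "\n"
            m = _UNITTEST.match(src, i) if at_line_start else None
            if m:
                depth, i = 0, m.end()              # start discarding
                continue
            out.append(src[i])
        elif src[i] == "{":
            depth += 1
        elif src[i] == "}":
            depth -= 1
            if depth <= 0:
                depth = None                       # block closed, resume copying
        i += 1
    return "".join(out)
```

Braces inside string literals and comments are never counted, because both are
consumed whole before the brace check is reached. Blank lines are left where
the removed blocks and comments used to be.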

- Jonathan M Davis


More information about the Digitalmars-d-learn mailing list