Potential of a compiler that creates the executable at once

H. S. Teoh hsteoh at quickfur.ath.cx
Fri Feb 11 17:36:37 UTC 2022


On Fri, Feb 11, 2022 at 04:47:46PM +0000, user1234 via Digitalmars-d wrote:
> On Friday, 11 February 2022 at 16:41:33 UTC, user1234 wrote:
> > On Friday, 11 February 2022 at 15:17:16 UTC, rempas wrote:
> > > On Friday, 11 February 2022 at 14:52:09 UTC, max haughton wrote:
> > > > 
> > > > The object emission code in the backend is quite inefficient, it
> > > > needs to be rewritten (it's horrible old code anyway)
> > > 
> > > I would love if they would do it but I can't complain that they
> > > don't. Openhub reports that [DMD] consists of 961K LoC!!
> > 
> > Openhub and their metrics are old trash. It's more like 170K
> > according to D-Scanner.
> 
> wait... it's 175K. I had not pulled for 8 months or so. There's
> much new code that was committed since, with importC notably.

I pulled just this week, and running `wc` on *.d *.c *.h says there are
365K lines.  I'm not sure what the *.h files are for, since DMD is now
self-hosting. Excluding *.h yields 347K lines.  But a lot of those are
actually blank lines and comments; excluding // comments, /**/ and /++/
block comments, and blank lines yields 175K.
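
A rough sketch of that kind of count, in Python rather than the exact
commands used above (which the post does not show). The names
`strip_d_comments` and `count_code_lines` are made up for illustration,
and the sketch does not parse string literals, so a comment marker
inside a string is wrongly treated as a comment:

```python
def _skip(out, src, start, end):
    # Keep the newlines of the skipped comment so line structure survives.
    out.append("\n" * src.count("\n", start, end))
    return end


def strip_d_comments(src: str) -> str:
    """Remove // comments, /* */ comments, and nesting /+ +/ comments.

    Simplification: string and character literals are not parsed.
    """
    out = []
    i, n = 0, len(src)
    while i < n:
        two = src[i:i + 2]
        if two == "//":
            # Line comment: drop everything up to (not including) the newline.
            end = src.find("\n", i)
            i = n if end == -1 else end
        elif two == "/*":
            # Block comment: does not nest.
            end = src.find("*/", i + 2)
            i = _skip(out, src, i, n if end == -1 else end + 2)
        elif two == "/+":
            # Nesting block comment: track the depth until it returns to 0.
            depth, j = 1, i + 2
            while j < n and depth > 0:
                if src.startswith("/+", j):
                    depth, j = depth + 1, j + 2
                elif src.startswith("+/", j):
                    depth, j = depth - 1, j + 2
                else:
                    j += 1
            i = _skip(out, src, i, j)
        else:
            out.append(src[i])
            i += 1
    return "".join(out)


def count_code_lines(src: str) -> int:
    """Count lines that still hold non-whitespace after comment removal."""
    return sum(1 for line in strip_d_comments(src).splitlines() if line.strip())
```

Summing `count_code_lines` over the contents of every *.d and *.c file
would give a figure comparable in spirit to the 175K above, though not
necessarily the same number.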

The 961K probably comes from the myriad test cases in the testsuite,
where more lines is actually a *good* thing.

But really, LoC is an unreliable measure of code complexity. Token count
would be more reflective of the actual complexity of the code, though
even that is questionable. Writing `enum x = 1 + 1;` would be 7 tokens
vs. `enum x = 2;` which is 5 tokens, for example, but the former may
actually make code easier to read in certain cases (e.g., if the longer
expression makes intent clearer than the shorter one).
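
To make the token counts concrete, here is a deliberately crude
tokenizer sketch in Python. It is not the D lexer: `TOKEN_RE` and
`count_tokens` are illustrative names, and multi-character operators
such as `==` are counted as one token per character. It is only meant
to reproduce the two counts quoted above:

```python
import re

# Very rough token classes; a real D lexer distinguishes many more.
TOKEN_RE = re.compile(
    r'''
      [A-Za-z_]\w*          # identifiers and keywords
    | \d[\w.]*              # numeric literals (rough)
    | "(?:\\.|[^"\\])*"     # double-quoted string literals
    | [^\s\w]               # any other single non-space character
    ''',
    re.VERBOSE,
)


def count_tokens(src: str) -> int:
    """Return the number of rough tokens found in src."""
    return len(TOKEN_RE.findall(src))


print(count_tokens("enum x = 1 + 1;"))  # 7
print(count_tokens("enum x = 2;"))      # 5
```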

Compressed size may be an even better approximation, because the size
under a high degree of compression approaches the Kolmogorov complexity
in the limit, which is a measure of the information content of the data. Stripping
comments and compressing (with the best compression algorithm you can
find), for example, would give a good approximation to the actual
complexity in the code.  Though of course, even that fails to measure
the inherent level of complexity in language constructs. So you couldn't
meaningfully compare compressed sizes across different languages, for
example.
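
A minimal sketch of the compressed-size idea, using Python's standard
`zlib` and `lzma` modules as stand-ins for "the best compression
algorithm you can find". The function name `compressed_sizes` is
invented here, and the input is assumed to have had its comments
stripped already:

```python
import lzma
import zlib


def compressed_sizes(src: str) -> dict:
    """Return raw and compressed byte counts for a piece of source text."""
    data = src.encode("utf-8")
    return {
        "raw": len(data),
        "zlib": len(zlib.compress(data, 9)),
        "lzma": len(lzma.compress(data, preset=9)),
    }


# Same statement repeated vs. 1000 distinct statements.
repetitive = "x = 1;\n" * 1000
varied = "".join(f"v{i} = {i * i};\n" for i in range(1000))

print(compressed_sizes(repetitive))
print(compressed_sizes(varied))
```

The repetitive text shrinks to a tiny fraction of its raw size while
the varied text does not, which is the sense in which compressed size
tracks information content better than a raw line count does. Any real
compressor only gives an upper bound on Kolmogorov complexity, so the
numbers are an approximation at best.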


T

-- 
Unix was not designed to stop people from doing stupid things, because that would also stop them from doing clever things. -- Doug Gwyn

