AST files instead of DI interface files for faster compilation and easier distribution

Dmitry Olshansky dmitry.olsh at gmail.com
Tue Jun 12 12:37:02 PDT 2012


On 12.06.2012 22:47, Adam Wilson wrote:
> On Tue, 12 Jun 2012 05:23:16 -0700, Dmitry Olshansky
> <dmitry.olsh at gmail.com> wrote:
>
>> On 12.06.2012 16:09, foobar wrote:
>>> On Tuesday, 12 June 2012 at 11:09:04 UTC, Don Clugston wrote:
>>>> On 12/06/12 11:07, timotheecour wrote:
>>>>> There's a current pull request to improve di file generation
>>>>> (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to
>>>>> suggest further ideas.
>>>>> As far as I understand, di interface files try to achieve these
>>>>> conflicting goals:
>>>>>
>>>>> 1) speed up compilation by avoiding having to reparse large files over
>>>>> and over.
>>>>> 2) hide implementation details for proprietary reasons
>>>>> 3) still maintain source code in some form to allow inlining
>>>>> and CTFE
>>>>> 4) be human readable
>>>>
>>>> Is that actually true? My recollection is that the original motivation
>>>> was only goal (2), but I was fairly new to D at the time (2005).
>>>>
>>>> Here's the original post where it was implemented:
>>>> http://www.digitalmars.com/d/archives/digitalmars/D/29883.html
>>>> and it got partially merged into DMD 0.141 (Dec 4 2005), first usable
>>>> in DMD 0.142
>>>>
>>>> Personally I believe that .di files are *totally* the wrong approach
>>>> for goal (1). I don't think goal (1) and (2) have anything in common
>>>> at all with each other, except that C tried to achieve both of them
>>>> using header files. It's an OK solution for (1) in C, it's a failure
>>>> in C++, and a complete failure in D.
>>>>
>>>> IMHO: If we want goal (1), we should try to achieve goal (1), and stop
>>>> pretending it's in any way related to goal (2).
>>>
>>> I absolutely agree with the above and would also add that goal (4) is an
>>> anti-feature. To get a human-readable version of the API, the programmer
>>> should use *documentation*. D claims that one of its goals is to make it
>>> a breeze to provide documentation by bundling a standard tool - DDoc.
>>> There's no need to duplicate this just to provide another format when
>>> DDoc itself is supposed to be format agnostic.
>>>
>> Absolutely. DDoc being built-in didn't sound right to me at first, BUT
>> it essentially lets us say that APIs are covered by the DDoc-generated
>> files, not by header files etc.
>>
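A quick sketch of the DDoc workflow in question (the function and the
file name mean.d are made up for illustration): running "dmd -D -o- mean.d"
renders the comments below to mean.html, and the output format can be
retargeted by redefining DDoc macros.

/// Computes the arithmetic mean of xs.
///
/// Params:
///     xs = the input values
/// Returns: the mean, or double.nan if xs is empty
double mean(const double[] xs)
{
    import std.algorithm.iteration : sum;
    return xs.length ? sum(xs) / xs.length : double.nan;
}
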
>>> This is a solved problem since the 80's (E.g. Pascal units).
>>
>> Right, seeing yet another newbie hit it every day is a clear indication
>> of a simple fact: people would like to think & work in modules rather
>> than seeing the guts of old and crappy OBJ file technology. Linking
>> with C != using C tools everywhere.
>>
>
> I completely agree with this. The interactions between the D module
> system and D toolchain are utterly confusing to newcomers, especially
> those from other C-like languages. There are better ways, see .NET
> Assemblies and Pascal Units. These problems were solved decades ago. Why
> are we still using 40-year-old paradigms?
>
>>> Per Adam's
>>> post, the issue is tied to DMD's use of OMF/optlink, which we all would
>>> like to get rid of anyway. Once we're in proper COFF land, couldn't we
>>> just store the required metadata (binary AST?) in special sections in
>>> the object files themselves?
>>>
>> Seconded. At least the lexed form could be very compact; I recall early
>> compressors tried the Huffman thing on source code tokens with some
>> success.
>>
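To make the "special sections" idea concrete, here is a hedged sketch,
not anything DMD does today: the section name .dtok and the payload are
invented, and @section is LDC's attribute from ldc.attributes, used
purely for illustration. The blob lands in its own object-file section,
where a tool or a future linker could find it without touching the code.

// Illustration only: stash a (stand-in) serialized token blob in a
// dedicated object file section. Requires LDC for @section.
import ldc.attributes : section;

// In the real scheme this would be the compressed lexed module;
// here it is just four placeholder bytes.
@section(".dtok") immutable ubyte[4] moduleTokens = [0xD0, 0x01, 0x02, 0x03];

void main() {}

Something like "objdump -s -j .dtok app.o" can then dump the blob back
out of the object file.
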
>
> I don't see the value of compression. Lexing would already reduce the
> size significantly and compression would only add to processing times.
> Disk is cheap.

I/O is not. On-the-fly (de)compression is an increasingly interesting
direction these days. The less you read/write, the faster you get.
Knowing the relative frequency distribution of keywords beforehand is a
boon. Yet I agree that it's premature at the moment.
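
As a rough sketch of that "Huffman thing on tokens" (the token set and
frequencies below are invented; a real table would be measured over a
corpus of D code), classic Huffman merging hands the shortest codes to
the most frequent tokens:

import std.container : heapify;
import std.stdio : writefln;

struct Node { ulong freq; int id; Node*[2] kids; }

// Walk the finished tree; a leaf's depth is its code length in bits.
void assignLengths(Node* n, uint depth, uint[] lengths)
{
    if (n.kids[0] is null) { lengths[n.id] = depth; return; }
    assignLengths(n.kids[0], depth + 1, lengths);
    assignLengths(n.kids[1], depth + 1, lengths);
}

void main()
{
    // Hypothetical relative frequencies of a handful of D tokens.
    immutable string[] names = ["identifier", "(", ")", ";", "if", "return"];
    immutable ulong[]  freqs = [400, 150, 150, 120, 40, 30];

    Node*[] pool;
    foreach (i, f; freqs)
        pool ~= new Node(f, cast(int) i);

    // Classic Huffman: repeatedly merge the two least frequent nodes.
    auto heap = heapify!"a.freq > b.freq"(pool);
    while (heap.length > 1)
    {
        auto a = heap.front; heap.removeFront();
        auto b = heap.front; heap.removeFront();
        heap.insert(new Node(a.freq + b.freq, -1, [a, b]));
    }

    auto lengths = new uint[names.length];
    assignLengths(heap.front, 0, lengths);
    foreach (i, n; names)
        writefln("%-10s freq %3d -> %d-bit code", n, freqs[i], lengths[i]);
}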

>
> Beyond that though, this is absolutely the direction D must head in. In
> my mind the DI generation patch was mostly just a stop-gap to bring
> DI-gen up-to-date with the current system, thereby giving us enough time
> to tackle the (admittedly huge) task of building COFF into the backend,
> emitting the lexed source into a special section, and then giving the
> compiler *AND* linker the ability to read out the source. For example,
> giving the linker the ability to read out source code essentially
> requires a brand-new linker. That said, it is my personal opinion that
> the linker should be integrated with the compiler and run as one step;
> this way the linker could have intimate knowledge of the source and
> would enable some spectacular LTO options. If only DMD were written in
> D, then we could really open the compile-speed throttles with an MT
> build model...
>


-- 
Dmitry Olshansky

