Compilation strategy

Paulo Pinto pjmlp at progtools.org
Mon Dec 17 12:34:09 PST 2012


On 17.12.2012 21:09, foobar wrote:
> On Monday, 17 December 2012 at 04:49:46 UTC, Michel Fortin wrote:
>> On 2012-12-17 03:18:45 +0000, Walter Bright
>> <newshound2 at digitalmars.com> said:
>>
>>> Whether the file format is text or binary does not make any
>>> fundamental difference.
>>
>> I too expect the difference in performance to be negligible in binary
>> form if you maintain the same structure. But if you're translating it
>> to another format you can improve the structure to make it faster.
>>
>> If the file had a table of contents (TOC) of publicly visible symbols
>> right at the start, you could read that table of contents alone to fill
>> symbol tables while lazy-loading symbol definitions from the file only
>> when needed.
>>
>> Often, most of the file beyond the TOC wouldn't be needed at all.
>> Having to parse and construct the syntax tree for the whole file
>> incurs many memory allocations in the compiler, which you could avoid
>> if the file was structured for lazy-loading. With a TOC you have very
>> little to read from disk and very little to allocate in memory and
>> that'll make compilation faster.
>>
>> More importantly, if you use only fully-qualified symbol names in the
>> translated form, then you'll be able to lazily load privately imported
>> modules because they'll only be needed when you need the actual
>> definition of a symbol. (Template instantiation might require loading
>> privately imported modules too.)
>>
>> And then you could structure it so a whole library could fit in one
>> file, putting all the TOCs at the start of the same file so it loads
>> from disk in a single read operation (or a couple of *sequential* reads).
>>
>> I'm not sure of the speedup all this would provide, but I'd hazard a
>> guess that it wouldn't be so negligible when compiling a large project
>> incrementally.
>>
>> Implementing any of this in the current front end would be a *lot* of
>> work however.
>
> Precisely. That is the correct solution and is also how [Turbo?] Pascal
> units (==libs) were implemented *decades ago*.
>
> I'd like to also emphasize the importance of using a *single*
> encapsulated file. This prevents synchronization hazards that D
> inherited from the broken C/C++ model.

I really miss that model, but at least Go has picked it up as well.

I still find it strange that many C and C++ developers are unaware that
we have had modules since the early '80s.
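
For what it's worth, the TOC idea Michel describes above can be sketched in
a few lines. This is only an illustration under made-up assumptions: the
`MODX` magic, the JSON-encoded TOC, and the names `write_module` and
`LazyModule` are all hypothetical and are not taken from any real D, Turbo
Pascal or Go file format. The reader parses only the TOC up front and seeks
to a symbol's definition the first time somebody asks for it.

```python
import io
import json
import struct

MAGIC = b"MODX"  # hypothetical magic number for this made-up format


def write_module(out, symbols):
    """Write {fully_qualified_name: definition_text} as header + TOC + bodies."""
    bodies = [(name, text.encode("utf-8")) for name, text in sorted(symbols.items())]
    toc = {}
    offset = 0
    for name, blob in bodies:
        # Offsets are relative to the start of the body area.
        toc[name] = (offset, len(blob))
        offset += len(blob)
    toc_bytes = json.dumps(toc).encode("utf-8")
    # Header: 4-byte magic followed by the TOC length as a little-endian uint32.
    out.write(MAGIC + struct.pack("<I", len(toc_bytes)))
    out.write(toc_bytes)
    for _, blob in bodies:
        out.write(blob)


class LazyModule:
    """Reads only the TOC on open; definitions are fetched on demand."""

    def __init__(self, f):
        self.f = f
        header = f.read(8)
        if len(header) != 8 or header[:4] != MAGIC:
            raise ValueError("not a module file")
        (toc_len,) = struct.unpack("<I", header[4:8])
        self.toc = json.loads(f.read(toc_len).decode("utf-8"))
        self.body_start = f.tell()  # bodies begin right after the TOC
        self.loaded = {}            # cache of definitions read so far

    def names(self):
        """Publicly visible symbols, known without touching any body."""
        return sorted(self.toc)

    def definition(self, name):
        """Seek to and read one definition the first time it is needed."""
        if name not in self.loaded:
            offset, length = self.toc[name]
            self.f.seek(self.body_start + offset)
            self.loaded[name] = self.f.read(length).decode("utf-8")
        return self.loaded[name]


if __name__ == "__main__":
    buf = io.BytesIO()
    write_module(buf, {
        "lib.math.square": "int square(int x) { return x * x; }",
        "lib.math.cube": "int cube(int x) { return x * x * x; }",
    })
    buf.seek(0)
    mod = LazyModule(buf)
    print(mod.names())                        # only the TOC has been read
    print(mod.definition("lib.math.square"))  # one seek + read for this body
```

A real compiler would store serialized declarations rather than source
text, and a whole library could put the TOCs of all its modules at the
front of one file, but the access pattern would be the same: one small
read to fill the symbol table, then seeks only for what is actually used.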

--
Paulo

