Compiler scalability. Question inspired by OOM errors seen by Crystal language

Thu Aug 31 10:39:35 PDT 2017

On Thu, Aug 31, 2017 at 04:36:41PM +0000, Stefan Koch via Digitalmars-d wrote:
> On Wednesday, 30 August 2017 at 16:34:13 UTC, H. S. Teoh wrote:
> > 
> > The CTFE problem will be fixed soon, once Stefan Koch finishes his
> > newCTFE engine.
> > 
> > Templates are still an area where a lot of improvement can be made.
> > But Stefan has indicated interest in visiting this area in the
> > compiler after the newCTFE project is done.  So eventually we'll get
> > there.
> > 
> > 
> > T
> 
> Yes that is true.
> However since I've just taken a full-time job I cannot say when it
> will be finished.
> 
> As for templates I have chosen to work around the issue entirely and
> integrate type-manipulation and introspection abilities with ctfe.
> (Which is inherently more scalable then templates)

I think D's conception of templates, which IMO goes beyond C++'s
(conceptually, anyway, the current implementation is not too different
fundamentally), can be implemented in a much better way that makes it
more scalable.

C++ sees templates as code stencils to be instantiated with the given
template arguments, i.e., copy-n-paste the stencil and substitute
template parameters with the given arguments.

D's templates for the most part still retains that, and certainly the
current implementation essentially just follows the C++ model. But IMO
there's an important conceptual difference: D sees template parameters
not so much as parameters for code stencils to be copy-n-pasted, but as
*compile-time* parameters to a function / set of declarations.  I.e.,
they are parameters that are known at compile-time rather than runtime.

The distinction is subtle but IMO important, because it allows us to
break out of the C++ mold of copy-n-paste-this-stencil that is one of
the causes of template scalability problems (pass too many different
template arguments, and you get massive code bloat, one copy per
instantiation).  By treating template arguments as arguments known at
compile-time instead, the compiler can get smarter with when to
copy-n-paste a declaration, and when to just evaluate the template with
the argument without actually instantiating anything (no copying of AST
nodes, etc.).

A common example is a templated linked-list container, where much of the
linked-list pointer manipulation code is actually independent of the
type of the contained data.  The compiler really only needs to generate
distinct copies of the code when the data itself is being manipulated;
all the other code can be shared between template instantiations.

This is just one example off the top of my head; I'm sure there are
plenty of others that we can come up with once we break free of the C++
mold of thinking of templates as "copy-n-paste except substitute X with
Y".  Another example that just occurred to me is the various recursive
templates in std.meta for manipulating AliasSeq's.  There is actually no
need for the compiler to instantiate anything at all, except perhaps for
the final result. By treating the AliasSeq as an in-compiler data type
with compile-time operations performed on it, the compiler ought to be
able to evaluate the template without needing to instantiate anything
past the first level of recursion.

T

-- 
"How are you doing?" "Doing what?"