Generative programming research project

H. S. Teoh hsteoh at quickfur.ath.cx
Fri Dec 1 20:26:41 UTC 2017


On Fri, Dec 01, 2017 at 08:10:03PM +0000, Ola Fosheim Grøstad via Digitalmars-d wrote:
> On Friday, 1 December 2017 at 19:43:53 UTC, H. S. Teoh wrote:
> > There's a link to the OptiML spec that, thankfully, makes it clear that
> > this is something primarily focused on machine learning and certain
> > categories of algorithms, rather than a one-size-fits-all miracle cure
> > that's slated to take over the entire programming world.
> 
> Well, not sure where they are heading, but this seems to focus on DSLs:
> 
> http://drops.dagstuhl.de/opus/volltexte/2015/5029/pdf/19.pdf
> 
> «As mentioned in Section 1 and throughout the paper, we argue for a
> rethinking of the role of high-level languages in performance critical code.
> In particular, as our work and the work of others have shown, generative
> abstractions and DSLs can help in achieving high performance by
> programmatically removing indirection, enabling domain-specific
> optimizations, and mapping code to parallel and heterogeneous platforms.»

Removing indirection smells like something specific to languages like
Java, where (almost) everything is a class.  In idiomatic D, temporaries
tend to be stack-based structs, and even with heap data the number of
indirections tends to be lower, at least IME.  Generative abstractions
seem similar in concept to D's metaprogramming capabilities.
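
To illustrate the indirection point, here's a rough, untested sketch of
my own: the same small value type written as a D struct versus a
Java-style class.

// In idiomatic D a small value type is a struct, so temporaries live on
// the stack and are accessed directly:
struct Point { double x, y; }

Point midpoint(Point a, Point b)
{
    // No allocation, no pointer chasing.
    return Point((a.x + b.x) / 2, (a.y + b.y) / 2);
}

// The Java-style equivalent is a class, where every temporary is a
// separate heap object reached through a reference -- exactly the kind
// of indirection that a generative approach then has to remove again:
class PointRef
{
    double x, y;
    this(double x, double y) { this.x = x; this.y = y; }
}

PointRef midpointRef(PointRef a, PointRef b)
{
    // One heap allocation per temporary.
    return new PointRef((a.x + b.x) / 2, (a.y + b.y) / 2);
}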

Domain-specific optimization is something I've been thinking about for a
long time now.  As hinted at in the article, today's programming
landscape seems to be concentrated around two extremes: high-level
abstractions, where you're basically relying on the compiler to do the
optimizations for you, and low-level tweaking, where you basically take
over the reins and do it all yourself from the ground up.  The problem
is that a general-purpose optimizer can't possibly anticipate everything
you might want to accomplish, so it can only implement general
optimizations, limited by the undecidability of the halting problem in
the most general case.  At the other extreme, taking things into your
own hands has the limitation that you may not be able to match what a
few decades of compiler optimization work has achieved, so the success
rate is lower and the time investment much higher.

But if the powerful optimizers of modern compilers could be accessed
more directly from the source code, we might be able to get at least
some of the best of both worlds.  If we could provide domain-specific
hints to the optimizer, it might do its job better than when it can only
guess at our intentions.

As a simple example, suppose we were to implement matrix algebra as a
user-defined type.  There are two usual approaches.  One is to implement
each operation as a separate function, which has the disadvantage that
since the compiler doesn't know matrix algebra, it's unable to refactor
matrix expressions into a more efficient form for execution.  The other
is to implement a full-fledged expression template system, which
basically means taking over the reins and telling the compiler exactly
how a matrix expression ought to be evaluated.  The problem with that is
that your chosen expression template optimizations may not be optimal
for the machine you're targeting, especially if you're writing a library
and cannot control what machines users will run it on.  You also cannot
easily leverage future improvements in the optimizer, because you've
committed to a specific implementation strategy in the expression
template.
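
To make the two approaches concrete, here's a rough, untested sketch of
my own in D, using plain vectors for brevity (the same idea extends to
matrices):

// Approach 1: each operation is a separate, eagerly-evaluated function.
// Every `+` materializes a full temporary, and the compiler has no idea
// that the whole expression could have been fused into one loop.
struct Vec
{
    double[] data;

    Vec opBinary(string op : "+")(Vec rhs)
    {
        auto result = new double[data.length];
        foreach (i; 0 .. data.length)
            result[i] = data[i] + rhs.data[i];
        return Vec(result);
    }
}

// Approach 2: expression templates.  `+` builds a lightweight expression
// node instead of computing anything; evaluation is deferred until
// assignment, where we generate a single fused loop ourselves.
struct Add(L, R)
{
    L lhs;
    R rhs;

    double opIndex(size_t i) { return lhs[i] + rhs[i]; }
    size_t length() { return lhs.length; }

    auto opBinary(string op : "+", T)(T other)
    {
        return Add!(Add!(L, R), T)(this, other);
    }
}

struct LazyVec
{
    double[] data;

    double opIndex(size_t i) { return data[i]; }
    size_t length() { return data.length; }

    auto opBinary(string op : "+", R)(R rhs)
    {
        return Add!(LazyVec, R)(this, rhs);
    }

    void opAssign(E)(E expr)
    {
        data.length = expr.length;
        foreach (i; 0 .. data.length)
            data[i] = expr[i];   // one fused loop, no intermediate arrays
    }
}

void main()
{
    auto a = LazyVec([1.0, 2.0, 3.0]);
    auto b = LazyVec([4.0, 5.0, 6.0]);
    auto c = LazyVec([7.0, 8.0, 9.0]);

    LazyVec r;
    r = a + b + c;   // evaluated in a single pass over the data
}

The second version avoids the temporaries, but it hard-codes my choice
of evaluation strategy: the optimizer only ever sees the loop I decided
to generate.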

But suppose there were a way to tell the compiler certain invariants,
i.e., mathematical identities, that matrix expressions satisfy.  You
could express, for example, the fact that matrix multiplication is
associative, that matrix addition is commutative, or that certain
expressions always have a constant (or easily-computed) result.  Then
you let the optimizer decide which of these identities to apply when
optimizing a matrix expression, based on its detailed knowledge of the
target machine architecture.  Essentially, you're *educating* the
optimizer with domain-specific knowledge, thereby enabling it to perform
optimizations that it might otherwise not be able to figure out on its
own.
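
To make that concrete, here's a purely hypothetical sketch of what such
declarations might look like in D, say as UDAs carrying rewrite rules.
Nothing like this exists today: the @identity attribute and the eye()
helper are made up, and a current compiler would simply ignore the
annotations.  The point is only the shape of the idea -- the library
author states the identities, and the optimizer decides when applying
them pays off on the target machine.

// Hypothetical UDA carrying an algebraic rewrite rule for the optimizer.
struct identity { string rule; }

struct Matrix
{
    double[] data;
    size_t rows, cols;

    // Made-up rule strings; eye(n) would be an n-by-n identity matrix.
    @identity("(a * b) * c == a * (b * c)")   // multiplication is associative
    @identity("a * eye(a.cols) == a")         // multiplying by eye() is a no-op
    Matrix opBinary(string op)(Matrix rhs) if (op == "*")
    {
        auto result = Matrix(new double[rows * rhs.cols], rows, rhs.cols);
        // ... the usual triple-loop multiplication, elided ...
        return result;
    }

    @identity("a + b == b + a")               // addition is commutative
    Matrix opBinary(string op)(Matrix rhs) if (op == "+")
    {
        auto result = Matrix(data.dup, rows, cols);
        foreach (i, x; rhs.data)
            result.data[i] += x;
        return result;
    }
}

With something along these lines, an expression like A*eye(n)*B could be
reduced to A*B, and the optimizer would be free to reassociate a chain
of multiplications into whatever order is cheapest for the given
dimensions and target, without me having to hard-code that decision in
an expression template.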

If this can be pulled off, it would allow us to have both
domain-specific optimizations *and* a general-purpose programming
framework.


T

-- 
People tell me that I'm paranoid, but they're just out to get me.

