Compile-time optimization

Tue Jul 23 16:30:10 PDT 2013

On Tue, Jul 23, 2013 at 10:11:17PM +0200, JS wrote:
[...]
> Note though that something like
> 
> string x = "asd"
> join(x,...); // x is compile time known and should be joined at
> compile time. Not sure if D is smart enough to get this one?

Well, if you want *that*, then you'll just have to submit a DMD pull
request for the optimizer. ;-)

The problem is that even though in this particular case it should be
easy for the compiler to figure out, in the general case it's far from
trivial. For example, you might write:

	string x = "asd";
	x ~= "def";
	join(x, ...);

Arguably, the compiler should detect that the value of x is computable
at compile-time, so it should apply the optimization. But the problem
is, the code between the first line and the call to join() can be
arbitrarily complex... for example:

	enum VeryLargeNumber = 2_391_382_391_483;
	string x = "asd";
	if (canFactorize(VeryLargeNumber))
		x ~= "x";
	join(x, ...);

Assuming canFactorize is CTFE-able, in theory the compiler should be
able to figure out the value of x at compile-time. But integer
factorization currently has no simple solution, which means your code
will take 5 months to compile while DMD tries to solve integer
factorization. Worse yet, you might write something like:

	string x = "asd";
	void function(int x) fn = createArbitraryFunction();
	if (canHalt(fn))
		x ~= "x";
	join(x, ...);

Then your program will *never* finish compiling, because the halting
problem is unsolvable, but nothing stops you from writing a CTFE-able
solution that requires infinite time to find an answer.

> When I get some time later I'll check your code out, what I want to
> do is get something like
> 
> string x = join(a, b, "a", "b"); to produce the same code as
> string x = a~b~"ab";
> 
> (no looping using for each except for string arrays)
[...]

That's pretty much what my code already does.  Just use GDC and it will
unroll the loop for you. ;-)

Alternatively, since tupleReduce returns a tuple, you *could* just
foreach over it (loops involving tuples are automatically unrolled at
compile-time) and generate the code. That is, instead of using
optimizedJoin() as a wrapper for a function, you could do something like
this instead:

	string optimizedJoin(args...)() {
		string result;
		foreach (arg; tupleReduce!(
				(string x, string y) => x~y, args))
		{
			result ~= arg;
		}
		return result;
	}

I haven't tested this code, though, so I'm not 100% sure if it will
work. But if not, a little massaging should make it work.

On another note, though, it makes me wonder why you want this level of
control over the generated code. Are you sure you're not doing premature
optimization? You should be aware that the more you use such arcane
tricks to generate this "optimized" code, the less the compiler's
built-in optimizer will be able to understand your intent, and the more
likely that it may be generating more inferior code than if you had just
written the code the "normal" way.

Generally, this level of manual optimization should only be done after
you've profiled your code and identified the hotspots. I know from
experience that more often than not, what you *think* is a hotspot
actually isn't. It's only after running a profiler that you find out
where the *real* hotspots are. :)  I used to be a hardcore optimization
freak -- used to read assembly output from the compiler to identify
inefficiencies -- until one day I actually ran a profiler on my code.
Turns out that the real hotspot is completely NOT where I expected it.
All those late nights spent "optimizing" my code, only to get negligible
performance gain, when a 1-line change in the *real* hotspot improved
performance by 30% instantly (and only took me 5 minutes instead of
hours and hours).

T

-- 
My program has no bugs! Only undocumented features...