Compile-time optimization

H. S. Teoh hsteoh at quickfur.ath.cx
Thu Jul 25 10:26:32 PDT 2013


On Thu, Jul 25, 2013 at 07:57:30AM -0700, H. S. Teoh wrote:
> On Thu, Jul 25, 2013 at 03:47:23PM +0200, JS wrote:
[...]
> > Allow for array reduction and delim/indexing. When using join we
> > typically want to join an array of strings while also inserting a
> > delimiter between them... usually with the last one removed.
[...]

Sigh... I could never resist a challenge. Here's a full, working,
compilable, runnable implementation:

	import std.stdio;
	
	template tuple(args...) {
		alias tuple = args;
	}
	
	/**
	 * Given a binary function and a tuple, returns a tuple in which all adjacent
	 * compile-time readable items are recursively substituted with the return
	 * value of the binary function applied to them.
	 */
	template tupleReduce(alias func, args...)
	{
		static if (args.length > 1)
		{
			static if (__traits(compiles, { enum x = args[0]; }))
			{
				static if (__traits(compiles, { enum x = args[1]; }) &&
					is(typeof(func(args[0], args[1]))))
				{
					enum result = func(args[0], args[1]);
					alias tupleReduce = tupleReduce!(func, result,
						args[2..$]);
				}
				else
				{
					alias tupleReduce = tuple!(args[0], args[1],
						tupleReduce!(func, args[2..$]));
				}
			}
			else
			{
				alias tupleReduce = tuple!(args[0],
					tupleReduce!(func, args[1..$]));
			}
		}
		else
		{
			alias tupleReduce = args;
		}
	}
	
	auto reduceImpl(alias fun, T...)(T args)
	{
		import std.functional;
		typeof(unaryFun!fun(args[0],args[0])) result;
	
		foreach (arg; args) {
			result = unaryFun!fun(result, arg);
		}
		return result;
	}
	
	template reduce(alias fun, args...)
	{
		string reduce() {
			return reduceImpl!fun(tupleReduce!(fun, args));
		}
	}
	
	template join(args...)
	{
		string join() {
			return reduce!((string x, string y) => x~y, args);
		}
	}
	
	template insertDelim(string delim, args...) {
		static if (args.length > 1)
			alias insertDelim = tuple!(args[0], delim,
				insertDelim!(delim, args[1..$]));
		else
			alias insertDelim = args;
	}
	
	template joinWithDelim(string delim, args...)
	{
		alias joinWithDelim = join!(insertDelim!(delim, args));
	}
	
	void joinDelimTest() {
		// Example of joinWithDelim
		string x = "runtime1";
		string y = "runtime2";
		string z = "runtime3";
	
		writeln(joinWithDelim!(":", "a", "b", x, "c", y, z, "d"));
	
		// Here's the proof that the delimiters have been concatenated with
		// adjacent items at compile-time where possible:
		writeln([ tupleReduce!((string x, string y) => x~y,
				insertDelim!(":", "a", "b", x, "c", y, z, "d")) ]);
	}
	
	void main() {
		joinDelimTest();
	}


The key points of note are the joinWithDelim template, which implements
a fully-working join with specifiable delimiter, and the code in
joinDelimTest() that illustrates how to use it and what it does. The
program produces this output:

	a:b:runtime1:c:runtime2:runtime3:d
	["a:b:", "runtime1", ":c:", "runtime2", ":", "runtime3", ":d"]

The last line indicates that the delimiters have been concatenated with
all adjacent compile-time readable arguments, where possible. So
basically, we start with:

	"a" "b" x "c" y z "d"

then delimiters are inserted:

	"a" ":" "b" ":" x ":" "c" ":" y ":" z ":" "d"

then this is optimized at compile-time into:

	"a:b:" x ":c:" y ":" z ":d"

By capturing this stage in an array, we get to see what the optimized
tuple looks like (2nd line of output), before joinWithDelim, at runtime,
finally interpolates x, y, and z into the result (1st line of output).

Looking at the disassembly of the program, compiled with gdc -O3, the
call to joinWithDelim is inside its own function (reasonable, since it's
rather large), in which there are 7 array concatenations, as expected.
(If the delimiters weren't optimized at compile-time, we'd have 13
concatenations instead.) That is, it's as if we wrote:

	"a:b" ~ x ~ ":c:" ~ y ~ ":" ~ z ~ ":d"

(The first concatenation happens on the empty array of the return value;
that's why there are 7 rather than 6.)

Exercise for the reader: ;-)

	Large numbers of array concatenations are not efficient at
	runtime.  Ideally, if the number of elements to be concatenated
	is > N, for some threshold N, we should use std.array.appender
	instead of the built-in ~.

	Rewrite the code to do this switchover to std.array.appender
	given some value of N, say 5.

	(Hint: since the length of tuples is conveniently available at
	compile-time as .length, we can just check if .length is greater
	than 5, and if so, instead of using ~ we call a different
	template function that declares a string appender, then use .put
	on it in a foreach over the tuple -- remember that foreach over
	tuples are expanded at compile-time.)

	(Unfortunately, this means that we can't use reduce!() anymore,
	since appender can only be used at runtime; at compile-time we
	still have to use ~. So we can't use the same lambda at both
	compile-time and runtime.)


T

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth? -- Michael Beibl


More information about the Digitalmars-d mailing list