The ABC's of Templates in D

H. S. Teoh hsteoh at quickfur.ath.cx
Fri Jul 31 17:57:58 UTC 2020


On Fri, Jul 31, 2020 at 01:46:43PM +0000, Mike Parker via Digitalmars-d-announce wrote:
[...]
> If you've got a code base that uses templates in interesting ways,
> please get in touch! We do offer a bounty for guest posts, so you can
> help with a bit of PR and make a bit of cash at the same time.

Not sure how blog-worthy this is, but recently I was writing a utility
that used std.regex extensively, and I wanted to globally initialize all
regexes (for performance), but I didn't want to use ctRegex because of
onerous compile-time overhead.  So my initial solution was to create a
global struct `Re`, that declared all regexes as static fields and used
a static ctor to initialize them upon startup. Something like this:

	import std.regex;

	struct Re {
		static Regex!char pattern1;
		static Regex!char pattern2;
		... // etc.

		// Runs once per thread at startup, before main().
		static this() {
			pattern1 = regex(`foo(\w+)bar`);
			pattern2 = regex(`...`);
			... // etc.
		}
	}

	auto myFunc(string input) {
		...
		auto result = input.replaceAll(Re.pattern1, `blah $1 bleh`);
		...
	}

This worked, but was ugly because (1) there's too much boilerplate to
declare each regex and individually initialize them in the static ctor;
(2) the definition of each regex was far removed from its usage context,
so things like capture indices were hard to read (you had to look at two
places in the file at the same time to see the correspondence, like the
$1 in the above snippet).

Eventually, I came up with this little trick:

	Regex!char staticRe(string reStr)()
	{
	    static struct Impl
	    {
		static Regex!char re;
		static this()
		{
		    re = regex(reStr);
		}
	    }
	    return Impl.re;
	}

	auto myFunc(string input) {
		...
		auto result = input.replaceAll(staticRe!`foo(\w+)bar`, `blah $1 bleh`);
		...
	}

This allowed the regex definition to be right where it's used, making
things like capture indices immediately obvious in the surrounding code.

Points of interest:

1) staticRe is a template function that takes its argument as a
   compile-time parameter, but at runtime, it simply returns a
   globally-initialized regex (so runtime overhead is basically nil at
   the caller's site, if the compiler inlines the call).

2) The regex is not initialized by ctRegex in order to avoid the
   compile-time overhead; instead, it's initialized at program startup
   time.

3) Basically, this is equivalent to a global variable initialized by a
   module static ctor. But since we can't inject global variables into
   module scope from a template function, we instead declare a wrapper
   struct inside the template function, with a static field that
   basically behaves like a global variable. Each distinct regex string
   gets its own template instantiation and hence its own struct, which
   also sidesteps the issue of generating unique global variable names
   at compile-time. To ensure startup initialization, we use a struct
   static ctor, which essentially gets concatenated to the list of
   module-static ctors that are run before main() at runtime.
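
For anyone who wants to try this out, here is a minimal self-contained
sketch of the trick. The staticRe function is the one from above; the
main() function and its sample input string are made up purely for
illustration.

```d
import std.regex : Regex, regex, replaceAll;
import std.stdio : writeln;

// Each distinct reStr produces its own instantiation, and therefore its
// own Impl struct with its own static field and static ctor.
Regex!char staticRe(string reStr)()
{
    static struct Impl
    {
        static Regex!char re;

        // Runs at startup (once per thread), not on every call.
        static this()
        {
            re = regex(reStr);
        }
    }
    return Impl.re;
}

void main()
{
    // Illustrative input only; the pattern sits right next to the
    // replacement string that refers to its capture group.
    auto result = "xx fooABCbar yy"
        .replaceAll(staticRe!`foo(\w+)bar`, `blah $1 bleh`);
    writeln(result); // prints: xx blah ABC bleh yy
}
```

Note the backquoted (WYSIWYG) string for the pattern: in a double-quoted
D string, `\w` is not a valid escape sequence.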

Well, OK, strictly speaking the regex is re-created per thread, because
static fields are thread-local by default and a plain `static this` runs
once in every thread. But since this is a single-threaded utility, it's
Good Enough(tm). (I didn't want to deal with `shared` or `__gshared`
issues since I don't strictly need them. But in theory you could do that
if you needed to.)
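
If you did want a single process-wide instance, a rough sketch might
look like the following. The name sharedStaticRe is hypothetical, and
this is an untested assumption rather than something from the original
utility: `__gshared` bypasses the type system's sharing checks, and
whether one Regex value can safely be used for matching from several
threads at once is not verified here.

```d
import std.regex : Regex, regex, replaceAll;
import std.stdio : writeln;

// Hypothetical variant of staticRe: one instance for the whole process.
Regex!char sharedStaticRe(string reStr)()
{
    static struct Impl
    {
        // __gshared: a single copy shared by all threads (unchecked).
        __gshared Regex!char re;

        // shared static this runs once per process, before main(),
        // rather than once per thread.
        shared static this()
        {
            re = regex(reStr);
        }
    }
    return Impl.re;
}

void main()
{
    // Single-threaded usage only; concurrent use is an open question.
    writeln("fooXYZbar".replaceAll(sharedStaticRe!`foo(\w+)bar`, `<$1>`));
}
```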

//

Here's a related trick using the same principles that I posted a while
ago: a D equivalent of gettext that automatically extracts translatable
strings. Basically, something like this:

	class Language { ... }
	Language curLang = ...;

	version(extractStrings) {
		private int[string] translatableStrings;
		string[] getTranslatableStrings() {
			return translatableStrings.keys;
		}
	}

	string gettext(string str)() {
		version(extractStrings) {
			static struct StrInjector {
				static this() {
					translatableStrings[str]++;
				}
			}
		}
		return curLang.translate(str);
	}

	...
	auto myFunc() {
		...
		writeln(gettext!"Some translatable message");
		...
	}

When compiled with -version=extractStrings, the gettext function uses a
static struct to inject a static ctor into the program, one per distinct
string, that inserts that string into a global AA at startup.  The same
switch also exposes the function getTranslatableStrings, which returns a
list of all translatable strings.  Voila! No need for a separate utility
to parse source code to discover translatable strings; this does it for
you automatically. :-)
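
Here is a self-contained sketch of the above that can be compiled as a
single file. Several pieces are assumptions added for illustration: the
body of the Language class (elided in the snippet above), the sample
strings, the unusedPath function, and the `version = extractStrings;`
line, which stands in for passing -version=extractStrings on the command
line.

```d
version = extractStrings; // stand-in for the -version=extractStrings switch

import std.algorithm.sorting : sort;
import std.stdio : writeln;

// Hypothetical stand-in for the elided Language class: a lookup table
// that falls back to the original string when no translation exists.
class Language
{
    string[string] table;

    string translate(string s)
    {
        if (auto p = s in table)
            return *p;
        return s;
    }
}

Language curLang;

version (extractStrings)
{
    private int[string] translatableStrings;

    string[] getTranslatableStrings()
    {
        return translatableStrings.keys;
    }
}

string gettext(string str)()
{
    version (extractStrings)
    {
        // One static ctor per distinct str; each registers its string.
        static struct StrInjector
        {
            static this()
            {
                translatableStrings[str]++;
            }
        }
    }
    return curLang.translate(str);
}

// Never called; its string should still be registered, because the
// gettext template is instantiated when this function is compiled.
void unusedPath()
{
    writeln(gettext!"Goodbye");
}

void main()
{
    curLang = new Language;
    curLang.table["Hello"] = "Bonjour";
    writeln(gettext!"Hello");

    auto keys = getTranslatableStrings();
    sort(keys); // AA key order is unspecified, so sort for stable output
    writeln(keys);
}
```

The point of unusedPath is that registration happens at startup for
every instantiation of gettext in the compiled program, whether or not
that call is ever reached at runtime.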

It could be made more fancy, of course, like having a function that
parses the current l10n files, does a diff to find strings that got
deleted / added / changed, and generates a report to inform the
translator which strings need to be updated.  This is reliable because
the extracted strings are obtained directly from the actual gettext
instantiations in the compiled program, rather than from a 3rd party
parser that may choke over uncommon syntax / unexpected formatting.
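
As a rough sketch of the diff step only: the function below compares the
extracted strings against the keys of an existing catalog. The names
CatalogDiff and diffCatalog are hypothetical, and parsing the l10n files
is deliberately left out, so the catalog keys are simply passed in as an
array.

```d
import std.algorithm.setops : setDifference;
import std.algorithm.sorting : sort;
import std.array : array;
import std.stdio : writeln;

// Hypothetical result type for the report.
struct CatalogDiff
{
    string[] added;   // in the program, missing from the catalog
    string[] removed; // in the catalog, no longer in the program
}

// Hypothetical helper; both inputs may be in any order.
CatalogDiff diffCatalog(const(string)[] extracted, const(string)[] catalogKeys)
{
    // setDifference expects sorted ranges, so sort private copies.
    auto e = extracted.dup;
    auto c = catalogKeys.dup;
    sort(e);
    sort(c);
    return CatalogDiff(setDifference(e, c).array, setDifference(c, e).array);
}

void main()
{
    auto d = diffCatalog(["Hello", "Goodbye", "Save"], ["Save", "Quit"]);
    writeln("added:   ", d.added);   // added:   ["Goodbye", "Hello"]
    writeln("removed: ", d.removed); // removed: ["Quit"]
}
```

With keys alone, a changed string shows up as one removal plus one
addition; pairing them up would need extra information such as a
message ID.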

D is just *this* awesome.


T

-- 
Computerese Irregular Verb Conjugation: I have preferences.  You have
biases.  He/She has prejudices. -- Gene Wirchenko
