Whole source-tree stateful preprocessing, notion of a whole program

Boris-Barboris via Digitalmars-d digitalmars-d at puremagic.com
Sat Apr 8 03:11:11 PDT 2017


Hello! This is a bit of a long one, I guess, but I'd like to have
some discussion of the topic. I'll start with a concrete use case:

For the sake of entertainment, I tried to write a generic
configuration management class. I was inspired by the oslo_config
Python package that I have to deal with at work. I started with:

# module config

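class Config
{
     this(string filename) { ... }
     // `abstract` is a prefix attribute in D; the class becomes
     // implicitly abstract once it has abstract members
     abstract void save();
     abstract void load();
     // Some implementation, json for example
}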

abstract class ConfigGroup(string group_name_par)
{
     static immutable group_name = group_name_par;
     protected Config CONF;
     this(Config config) { CONF = config; }

     mixin template ConfigField(T, string opt_name)
     {
         mixin("@property " ~ T.stringof ~ " " ~ opt_name ~
               "() { return CONF.root[\"" ~ group_name ~ "\"][\"" ~ opt_name ~
               "\"]." ~ json_type(T.stringof) ~ "; }");
         mixin("@property " ~ T.stringof ~ " " ~ opt_name ~
               "(" ~ T.stringof ~ " value) { return CONF.root[\"" ~ group_name ~
               "\"][\"" ~ opt_name ~ "\"]." ~ json_type(T.stringof) ~ "(value); }");
     }
}

# module testconfig

private class TestConfigGroup: ConfigGroup!("testGroup")
{
	this(Config config) { super(config); }

	mixin ConfigField!(string, "some_string_option");
	mixin ConfigField!(double, "some_double_option");
}
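
Filled in with a few assumptions so that it compiles on its own, the
idea looks roughly like the sketch below. Config here is only a
stand-in that parses a JSON string into a std.json JSONValue (no file
handling, no save/load), and json_type is my guess at a helper that
maps a D type name to the matching JSONValue accessor. Treat it as an
illustration, not as the real thing.

```d
import std.json;

// Stand-in for Config (assumption): just holds a parsed JSON tree.
class Config
{
    JSONValue root;
    this(string jsonText) { root = parseJSON(jsonText); }
}

// Assumed helper: D type name -> name of the JSONValue accessor.
// Runs at compile time (CTFE) when used inside the string mixins.
string json_type(string typeName)
{
    switch (typeName)
    {
        case "string": return "str";
        case "double": return "floating";
        case "long":   return "integer";
        default: assert(0, "unsupported option type: " ~ typeName);
    }
}

abstract class ConfigGroup(string group_name_par)
{
    static immutable group_name = group_name_par;
    protected Config CONF;
    this(Config config) { CONF = config; }

    mixin template ConfigField(T, string opt_name)
    {
        // getter
        mixin("@property " ~ T.stringof ~ " " ~ opt_name ~
              "() { return CONF.root[\"" ~ group_name ~ "\"][\"" ~ opt_name ~
              "\"]." ~ json_type(T.stringof) ~ "; }");
        // setter
        mixin("@property " ~ T.stringof ~ " " ~ opt_name ~
              "(" ~ T.stringof ~ " value) { return CONF.root[\"" ~ group_name ~
              "\"][\"" ~ opt_name ~ "\"]." ~ json_type(T.stringof) ~ "(value); }");
    }
}

class TestConfigGroup : ConfigGroup!("testGroup")
{
    this(Config config) { super(config); }

    mixin ConfigField!(string, "some_string_option");
    mixin ConfigField!(double, "some_double_option");
}

unittest
{
    auto conf = new Config(
        `{"testGroup": {"some_string_option": "abc", "some_double_option": 1.5}}`);
    auto g = new TestConfigGroup(conf);
    assert(g.some_string_option == "abc");
    g.some_double_option = 2.5;   // goes through the generated setter
    assert(conf.root["testGroup"]["some_double_option"].floating == 2.5);
}
```

Note that every access still goes through the CONF reference and a
JSON lookup by key, which is exactly the runtime indirection I
complain about below.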


... aand I stopped. And here are the blockers I saw:

1). I had to save the template parameter group_name_par into
group_name. It looks like template mixins don't support closures.
A minor inconvenience, I would say, and it's not what I would like
to talk about.
2). After preprocessing I wish to have a fully typed, safe and
fast Config class that contains all the groups I defined for it in
its body, and not as references. I don't want a pointer lookup at
runtime to get some field. This is actually quite a problem for D:
     2.1). It looks like mixin is the only instrument to extend a
class body. The obvious solution would be a loop that mixes in
definitions from some array known at compile time, maybe even a
string array. And the prettiest way of all would be for such an
array to contain the module names and class names of all
ConfigGroup derivatives defined in the whole program (absolute
madman). Said array could be appended to at compile time by every
derivative of ConfigGroup.
     2.2). The sweet dream of 2.1 runs into the absence of tools to
create and manipulate state during preprocessing. For example:

     // maybe some special qualifier instead of immutable will be
     // better. Even better if it shifts to immutable during run-time
     immutable string[] primordial = [];
     premixin template (string toAdd) { primordial ~= toAdd; } // for example
     mixin template Populate
     {
         foreach (s; primordial)
             mixin("int " ~ s);    // create some int field
     }
     class Populated { mixin Populate; }
     // another module
     premixin("field1")  // evaluated in preprocessor in
     premixin("field2")  // order of definition
     Populated p = new Populated;
     p.field1 = 3;

         By "premixin" I mean that all such operations are
performed in a special preprocessor stage that completes before
the mixins we already have start doing their jobs.

     2.3). There is strong C ancestry in D. The part about
compilation being performed on translation units (.d source
files) is, in my opinion, quite devastating. I don't know about
you guys, but in 2017 I compile programs. I don't care about
individual object files and linker shenanigans; for me it's the
whole program that matters, and object files are just the way C
does its thing. You definitely must respect that when interfacing
with C, but that's about it. Correct me if I'm wrong, but the
departure from C's compilation process (the CLI is not the cause
here, I believe) allowed C# to incorporate "partial" classes,
which are a wonderful concept: a class can be extended only
voluntarily (like in D, where we need to willingly write a mixin
to change a definition), and source code changes stay localized:
when project functionality is extended, the old code base can
sometimes remain completely untouched (this is huge for very big
projects IMO). I will not deny, however, that readability of such
code suffers. As a counter-argument, relationships between a
portion of the class and other code are usually local, in the
sense that this class part's fields are used by source code in
this folder and basically nowhere else.
         But what's done is done, I understand. However, I
believe there is still hope for the preprocessor: it can be
generalized to the whole source tree without throwing the old
toolchain out of the window, in a way that would allow the
"primordial" string array from the example above to be the same
for all translation units after preprocessing is done.
     2.4). The original configuration management example would
also require the ability to import definitions cyclically. Module
A, containing the ConfigGroupConcrete instantiation, imports
module B where Config is defined, which will require B to import A
in order to access the ConfigGroupConcrete definition. Yet another
stone in C's garden, yes. You could, for example, pass the whole
ConfigGroupConcrete body as a string and mix it in there, but then
you would need to build such a string automatically, and at this
point you're better off with some kind of templating language. And
templating languages make things even less readable IMO, while
simply being crutches that replace language preprocessors that
don't follow industry needs. I do believe such a case is out of
reach until preprocessing is done on the whole program as a unit.
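
For comparison, here is the closest approximation of the 2.1 loop
that I can see working in D as it is, as long as the whole list lives
in one place. It is only a sketch: the "primordial" array has to be
spelled out by hand in a single module, which is precisely the
limitation I'm complaining about, and declareInts is a helper name I
made up for the example.

```d
// Compile-time list of field names. In the wished-for scheme other
// modules would append to it; here it must be written out in one place.
enum string[] primordial = ["field1", "field2"];

// CTFE helper (hypothetical name): builds one int declaration per name.
string declareInts(const string[] names)
{
    string code;
    foreach (name; names)
        code ~= "int " ~ name ~ ";\n";
    return code;
}

mixin template Populate()
{
    // The string is computed at compile time and mixed in as declarations.
    mixin(declareInts(primordial));
}

class Populated
{
    mixin Populate;
}

unittest
{
    auto p = new Populated;
    p.field1 = 3;
    assert(p.field1 == 3);
    // field2 exists as a real member, checked at compile time
    static assert(__traits(hasMember, Populated, "field2"));
}
```

The fields end up directly in the class body, so there is no lookup
at runtime. What is missing is any way for another module to add
"field3" to the list before Populated is compiled.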


To conclude, I'll summarize my questions:
1). Is there a compiled language that is capable of the
above-mentioned tricks without resorting to external templating
meta-languages?
2). How deep does the rabbit hole go in terms of the complexity of
the preprocessor modifications required? And for DMD in general?
3). How are cyclic module imports currently handled in D?
4). Is there hope that this is possible to do in, say, a year? I
don't mind trying to implement it myself, but I don't want to
invest time in something so conceptually out of line that it would
simply be too destructive for the current compiler environment.

