Thoughts about "Compile-time types" talk

Fri May 10 03:33:07 UTC 2019

On Friday, 10 May 2019 at 00:33:04 UTC, H. S. Teoh wrote:
> Skimmed over Luís Marques' slides on "compile-time types", and 
> felt compelled to say the following in response:  I think we're 
> still approaching things from the wrong level of abstraction.  
> This whole divide between compile-time and runtime is IMO an 
> artificial one, and one that we must transcend in order to 
> break new ground.
>
> I haven't fully thought through this yet, but the basic concept 
> is that there should be *no need* to distinguish between 
> compile-time and runtime in the general case, barring a small 
> number of situations where a decision has to be made.  Ideally, 
> a function should have just a single parameter list -- which 
> Luís has already hit upon -- but I'd go even further and 
> propose that there should be NO distinction made between 
> runtime and compile-time parameters, at least at the 
> declaration level.  It should be the *caller* that decides 
> whether an argument is compile-time or runtime. And that 
> decision in the caller does not have to be made within the 
> caller itself, but could be dictated by *its* caller, and so 
> on. As far as *most* code is concerned, you're just passing 
> arguments from one function to another -- whether it's CT or RT 
> is immaterial. Only at the very top level do you need to make a 
> decision whether something is compile-time or runtime.
>
> The result should be that you can make a single top-level 
> change and a whole cascade of function arguments down the call 
> chain become instantiated at compile-time, or resp. runtime.  
> Most of the code in between *does not have to change* at all; 
> the compiler *infers* whether something is available at CT or 
> has to be deferred to RT, based on the caller's arguments.  
> Some things, of course, like alias parameters and what-not will 
> force CT (though in theory one could implement RT counterparts 
> for them), but that should be *transparent* to the surrounding 
> code.
>
> Just like in OO, changing a class's implementation ought not to 
> require changing every piece of code that uses the class 
> (encapsulation and Liskov substitution principle), so the 
> distinction between CT and RT should be transparent to *most* 
> code.  Only code that actually cares about the difference 
> should need to choose between one or the other. The rest of the 
> code should remain agnostic, and thus remain symmetric under 
> any decision to switch between RT/CT.
>
> For example, we could have first class types in the language: a 
> function can take a type argument and do stuff to it, and 
> depending on whether this type is known at CT, the compiler 
> could resolve it at compile-time or implement it at runtime 
> using the equivalent of typeid() -- and the function *doesn't 
> have to care which*.  You could sort a list of types, use range 
> algorithms on them, etc., and if you call it at compile-time, 
> it will produce a compile-time result; if you call it at 
> runtime, it will produce a runtime result.  Which one it will 
> be doesn't have to be a decision that's arbitrarily decided 
> within the function itself; it can be delegated to higher-level 
> code that has the wider context with which to make the best 
> decision.
>
> Another example, you could have general matrix multiplication 
> code with matrix size as a parameter, and at higher-level 
> decide you only need, say, 4D matrices, so the size can be made 
> a CT argument for one project, allowing the optimizer to 
> generate the best code for the 4D-specific case, but an RT 
> argument for another project that wants to operate on general 
> n*n matrices, without needing to implement matrix 
> multiplication twice. The latter would have the size as an RT 
> parameter, so you trade off optimization for flexibility -- *at 
> the decision of higher-level code*, rather than arbitrarily 
> tossing a coin or guessing what your callers might want.
>
>
> T

I have actually come up with this idea several months before. You 
are basically spot on. The old paradigm of programming needs to 
change. My idea is a big more elaborate though:

The type system is hierarchical and recursive. Essentially all 
programming is based on levels which is essentially a 
parameterized type system on the level.

One works at various levels and uses the same syntax at all 
levels. The compiler the simply tries to "collapse" all the 
levels to the lowest known level. The lowest level is runtime 
which uses unknown inputs from the user and so cannot be 
collapsed any more.

I've been trying to think about how to represent this well in 
programming but haven't worked on it much.

What I'm saying may not make much sense but think about it like 
D's CT and RT... they are just two levels. But we can have higher 
levels that compile code in to CT code for a higher level of 
abstraction. All higher levels of abstraction must be known so 
the code can be reduced to the lowest level, which is RT. RT code 
would be anything that is not known at compile time and hence the 
program must be run to determine the information. Essentially RT 
is just the file compilation step.

Such code though is unified at all levels. Same semantics, same 
syntax, same type system, same everything.

Essentially every statement is a statement that exists at some 
lowest level and that level depends on what the statement "see's" 
and what "see's" it.

My initial idea was to have levels specified by the user as what 
level of abstraction they are working in. The compiler then 
reduces the highest levels of abstraction filling out the inputs 
for the level right below it... which then should allow it to be 
reduced and everything cascades down to the final level of RT, 
which uses things that cannot be reduced. The compiler realizes 
it can only be reduced by "executing" the program. The program 
itself sort of is the compiler then, filling out the rest of the 
information to finally "compile" the entire program.

I've been trying to come up with a universal syntax but haven't 
really thought about it much as it's a long term project. I'm 
trying to get away from C's syntax but my initial conceptual idea 
was to attach levels to statements or blocks. One then just works 
in the level they want and to build code for lower levels.

e.g.,

int x[4] = 34;

is an int at level 4. It is known to all lower levels of code who 
will, if the value didn't change at any higher level, be 34 to 
all the lower levels.

e.g.,

int x[3] = x[4]*3; // compiler can reduce this to a constant in 
level 3 after level 4 has been reduced

The main idea, which we are both hitting on here, is that 
information that is known is known and information that isn't 
requires "running" the program. But really these are, in some 
sense, the same. The runtime information is known, but only by 
the user... as if the user "completes" the program.

e.g.,

readln()

The reason the statement above cannot reduce is because it is not 
known at "compile time"... but that is relatively arbitrary if 
one thinks of runtime as just the "completion of compile time".

If one abstracts out that the entire process, including the user 
input is "compilation" and it's all just reducing to a known 
state(which for user input is delayed).

The idea with the levels above is all the code syntax is the same 
at every level. In D, RT and CT are not unified. CTFE bridges it 
but static if and if are not the same syntax. They distinguish 
between the two levels.

To get all of it to work may be complicated. I'm only in the 
initial investigations of it and I don't plan on putting a lot of 
time in to it. I'm just recording my ideas as I come up with 
them. In any case all I have worked on is creating the TYPE. The 
TYPE is simply any type... and everything is a type. It's so 
general as to include anything(functions, statements, other 
types, etc). It's what unifies everything. The compiler then just 
attempts to reduce any types to "known" values. At the end of 
"compilation" anything that is not known is either an error or 
runtime(the type cannot be reduced without user input).

Typing of course is categorical and one then has to create 
abstract subtyping but it unifies the type system, sorta like 
object does with oop.

I believe though that not just this level of abstraction is 
required but a new IDE that supports the level of abstract along 
with a higher level of text/graphical entry.