Understanding regexes (Was: Re: DMD 0.147 release)

Don Clugston dac at nospam.com.au
Thu Feb 23 00:23:56 PST 2006


Georg Wrede wrote:
> Walter Bright wrote:
>> "Georg Wrede" <georg.wrede at nospam.org> wrote
>>> Walter Bright wrote:
>>>
>>>> If the compiler is to constant fold regular expressions, then it 
>>>> needs to build in to the compiler exactly what would happen if
>>>> the regex code was evaluated at runtime.
>>>
>>> Yes. IMHO in essence, the binary machine code, which the runtime
>>> also would build. What I have a hard time seeing is, how this
>>> differs from building a normal function at compile time?
>>
>> Consider the strlen() function. Compiling a strlen() function and
>> generating machine code for it is a very different thing from the
>> compiler knowing what strlen is and replacing:
>>
>> strlen("abc")
>>
>> with:
>>
>> 3
> 
> Either I'm getting too old for this business, or you're only giving 
> pseudo answers.
> 
> (1) If we were to stop the compiler dead in its tracks, and I compiled 
> the function "manually" and returned it to the compiler, would we still 
> have a problem here?

That would be OK. The issue is that the compiler is a tool for 
converting text to machine code. It has no mechanism for executing the 
machine code.

> (2) {-- and this I've so far avoided to bring up, out of courtesy --},
> if Don can do it with templates, what's so impossible doing it the 
> regular way??

The compiler does have a mechanism for executing the "template language" 
at compile time, which is what my code is using. But, the template 
language (which I'll call Double D (DD) :-) ) is fundamentally different 
to the ordinary D language (eg, it has no variables). Conceivably, a 
compiler could convert a D function into a DD metafunction, provided 
that it doesn't write to any variables except at initialisation, and 
doesn't use any control structures other than "if-else" and "return",
and all of its parameters are compile-time constants. But that would be 
so restricted as to be almost useless.

Of course the compiler itself could have the DD code built into it, but 
DD is a horribly inefficient language, and it would be hideous to 
program from inside the compiler.

What could perhaps be done is to allow functions with all-constant 
parameters to be converted into overloads.

eg we have the DD metafunction

int strlenT!(char [] s)

Then, if we could define some kind of syntax like
const alias int strlen(char [] s) strlenT!(s);

as an overload of strlen, so that if all parameters are compile-time 
constants, then the reference to strlen becomes a template instantiation.

More generally, if the lookup mechanism for functions was changed to be:
If the first n parameters of a functions are all compile-time constants, 
C1, C2, ... with the remainder being variables or constants, V1, V2, ...
try to find a matching template.
eg, given
   func(C1, C2, C3, V1, V2, C4)
the following functions are looked for, in this order:
func!(C1, C2, C3)(V1, V2, C4);
func!(C1, C2)(C3, V1, V2, C4);
func!(C1)(C2, C3, V1, V2, C4);
func(C1, C2, C3, V1, V2, C4);

Note that as soon as a template is found, the search stops.
eg if there is a
func(C1, C2, C3) which doesn't have a (V1, V2, V3) member function, 
compilation will fail even if a function func(p1, p2, p3, p4, p5, p6) 
exists.

This is superficially akin to overloading 'const' parameters in C++, but 
unlike C++ "const" would actually mean "constant" and not just "I'm not 
_supposed_ to change it".



More information about the Digitalmars-d-announce mailing list