compile-time regex redux
Andrei Alexandrescu (See Website For Email)
SeeWebsiteForEmail at erdani.org
Wed Feb 7 18:54:11 PST 2007
Bill Baxter wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> kenny wrote:
>>> Walter, I don't hate regex -- I just don't use it. It seems to me
>>> that to figure out regex syntax takes longer than writing quick
>>> for/while statements, and I usually forget cases in regex too...
>>
>> I think this is an age-old issue: if you don't know something, you
>> find it harder to do things that way. The telling sign is that people
>> who know _both_ simple loops and regexes do use regexes, and as a
>> consequence are way more productive at a certain category of tasks.
>
> Hmm. More productive, probably. Writing better code? Not clear. I
> would guess that in many cases the results are not as easy to maintain
> as non-regexp code.
I don't think that guess is right. Following the logic of hand-written
loops for even a simple parsing task (e.g. a floating-point number in
all of its splendor) is horrendous. For somebody who knows regexes, the
pattern is obvious in a second.
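To make the comparison concrete, here is a rough sketch in Python (not
D, and not code from this thread). It assumes one particular float
grammar: optional sign, digits with an optional fraction or a bare
fraction such as ".5", and an optional exponent. The names FLOAT_RE,
is_float_regex and is_float_loop are made up for illustration.

```python
import re

DIGITS = "0123456789"

# Assumed grammar (illustrative only): optional sign, a mantissa with at
# least one digit, and an optional exponent with at least one digit.
FLOAT_RE = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][+-]?[0-9]+)?")


def is_float_regex(s):
    """Return True if the whole string matches the float pattern."""
    return FLOAT_RE.fullmatch(s) is not None


def is_float_loop(s):
    """Hand-written scanner for the same grammar, using only loops."""
    i, n = 0, len(s)
    # Optional leading sign.
    if i < n and s[i] in "+-":
        i += 1
    # Integer digits.
    start = i
    while i < n and s[i] in DIGITS:
        i += 1
    int_digits = i - start
    # Optional fractional part.
    frac_digits = 0
    if i < n and s[i] == ".":
        i += 1
        start = i
        while i < n and s[i] in DIGITS:
            i += 1
        frac_digits = i - start
    # The mantissa needs at least one digit on either side of the dot.
    if int_digits == 0 and frac_digits == 0:
        return False
    # Optional exponent: e or E, optional sign, at least one digit.
    if i < n and s[i] in "eE":
        i += 1
        if i < n and s[i] in "+-":
            i += 1
        start = i
        while i < n and s[i] in DIGITS:
            i += 1
        if i == start:
            return False
    # Accept only if the whole input was consumed.
    return i == n
```

Both functions are meant to accept the same strings. The regex states
the grammar in one line, while the loop version spreads the same rules
over several index checks that are easy to get subtly wrong.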
I do agree that code written by somebody who knows regexes is hard to
maintain for somebody who does not know regexes, but that's pretty much
self-evident and holds for any other technique.
All I can say is that I got significantly enriched and more effective as
a programmer at large after I sat down and understood Perl's regex
bestiary. I now see my previous arguments against them as
rationalizations of my resistance to go through the effort of learning.
Again comparing myself with my former self, I understand it's hard to
discuss relative advantages and disadvantages with someone who doesn't
know them because of a bootstrap problem: I say they make code much
simpler and easier to comprehend, while my former self would say exactly
the opposite. It's pretty much like math notation, eating vegetables, or
classical music: it's hard to bootstrap oneself into appreciating it.
> Anyway, I think the question is whether compile-time regexp is really
> the right level of abstraction to be targeting. Wouldn't it be
> infinitely better to have the compile-time code facilities be so good
> that you could just write a regexp parser as a compile-time D library?
This is possible in today's D. The problem is that it would be a Pyrrhic
victory: the resulting engine would be very slow and big.
I do agree that it would be nice to look into creating compile-time
amenities that make such an engine fast and small.
> I mean what is regexp, but a particular DSL? If the new facilities are
> trying to make DSL's easier to create, regexp is a great target DSL. So
> what compile-time language facilities do you need to implement an
> efficient and clean compile-time regexp library?
Conceptually, you'd need the following: (1) compile-time functions, (2)
compile-time mutable variables, and (3) compile-time loops. We already
have the rest. Then you can write compile-time code as comfortably as
writing run-of-the-mill run-time code. D is heading that way, but in
small steps.
Implementation-wise, string-based templates must be made cheaper. If we
get compile-time mutation, this will probably not be much of a problem,
because much of what is now written in functional style could be
written using mutation instead. I personally enjoy functional-style
code, but it's not really needed during compilation and is a bit
foreign to the rest of D, which remains largely imperative.
> It would be nice if we could write more-or-less generic D code with a
> few compile time restrictions. For instance you can write any function
> you want that takes only const values as arguments and returns a const
> value, and refers to only global const values and other such const-only
> functions.
Templates already do that, albeit with a slightly odd syntax. But stay
tuned, Walter is eyeing $ as the prefix to denote compile-time
variables, and sure enough, compile-time functions will then emerge
naturally :o).
Andrei