re2d lexer generator
Ulya
skvadrik at gmail.com
Mon Nov 25 16:01:54 UTC 2024
The regular expression compiler [re2c](http://re2c.org) now [supports
D](http://re2c.org/releases/release_notes.html#release-4-0).
A short intro from the official website: *re2c* stands for
*Regular Expressions to Code*. It is a free and open-source lexer
generator that supports C, C++, D, Go, Haskell, Java, JavaScript,
OCaml, Python, Rust, V, Zig, and can be extended to other
languages by implementing a single [syntax
file](http://re2c.org/manual/manual_d.html#syntax-files). The
primary focus of re2c is on generating *fast* code: it compiles
regular expressions to deterministic finite automata and
translates them into direct-coded lexers in the target language
(such lexers are generally faster and easier to debug than their
table-driven analogues). The secondary focus of re2c is on
*flexibility*: it does not assume a fixed program template;
instead, it allows the user to embed lexers anywhere in the
source code and configure them to avoid unnecessary buffering and
bounds checks. The internal algorithm used by re2c is based on a
special kind of deterministic finite automaton: [lookahead
TDFA](http://re2c.org/2022_borsotti_trofimovich_a_closer_look_at_tdfa.pdf).
These automata are as fast as ordinary DFA, but they are also
capable of performing submatch extraction with minimal overhead.
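
To make the distinction between direct-coded and table-driven lexers more concrete, here is a small hand-written sketch in C. It is not re2c output, and the names `match_direct`, `match_table`, `classify` and `self_check` are made up for this illustration. Both functions recognize the same assumed pattern `[1-9][0-9]*` over a whole NUL-terminated string: the first encodes DFA states as labels and transitions as jumps, the second keeps the state in a variable and looks transitions up in a table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Direct-coded DFA: each state is a position in the code,
 * each transition is a conditional jump. */
bool match_direct(const char *s) {
    char c;

    /* start state: expect a nonzero digit */
    c = *s++;
    if (c >= '1' && c <= '9') goto digits;
    return false;

digits:
    /* accepting state: consume further digits until end of input */
    c = *s++;
    if (c >= '0' && c <= '9') goto digits;
    return c == '\0';
}

/* Table-driven DFA: the same automaton stored as data. */
enum { S_START, S_DIGITS, S_FAIL, N_STATES };
enum { C_ZERO, C_NONZERO, C_OTHER, N_CLASSES };

/* Map an input character to its character class (a table column). */
static int classify(char c) {
    if (c == '0') return C_ZERO;
    if (c >= '1' && c <= '9') return C_NONZERO;
    return C_OTHER;
}

static const int transitions[N_STATES][N_CLASSES] = {
    /*              C_ZERO    C_NONZERO  C_OTHER */
    /* S_START  */ { S_FAIL,   S_DIGITS,  S_FAIL },
    /* S_DIGITS */ { S_DIGITS, S_DIGITS,  S_FAIL },
    /* S_FAIL   */ { S_FAIL,   S_FAIL,    S_FAIL },
};

bool match_table(const char *s) {
    int state = S_START;
    for (; *s != '\0'; ++s) {
        state = transitions[state][classify(*s)];
    }
    return state == S_DIGITS;
}

/* Both lexers must agree on every input. */
void self_check(void) {
    const char *inputs[] = { "1234", "7", "0", "0123", "12a", "" };
    for (size_t i = 0; i < sizeof inputs / sizeof inputs[0]; ++i) {
        assert(match_direct(inputs[i]) == match_table(inputs[i]));
    }
}
```

In the direct-coded version the current state is implicit in the program counter, so a debugger can step through the states as ordinary code. The table-driven version needs an extra table lookup per input character.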
There is a [detailed user
guide](http://re2c.org/manual/manual_d.html) and an [online
playground](http://re2c.org/playground/?example=d/01_basic.re)
with many examples.