Do we need faster regex?
Dmitry Olshansky
dmitry.olsh at gmail.com
Mon Dec 18 18:34:51 UTC 2023
On Monday, 18 December 2023 at 17:16:40 UTC, H. S. Teoh wrote:
> On Sun, Dec 17, 2023 at 03:43:22PM +0000, Dmitry Olshansky via
> Digitalmars-d wrote:
>> So I’ve been working on rewind-regex trying to correct all of
>> the decisions in the original engine that slowed it down,
>> dropping some features that I knew I cannot implement
>> efficiently (backreferences have to go).
>>
>> So while I’m obsessed with simplicity and speed I thought I’d
>> ask people if it was an issue and what they really want from
>> gen2 regex library.
> [...]
>
> What I really want:
>
> - Reduce compile-time cost of `import std.regex;` to zero, or at least
>   close enough it's no longer noticeable.
>
> - Automatic caching of fixed-string regexes, i.e., the equivalent of:
>
> 	struct Re(string ctKnownRe) {
> 		Regex!char re;
> 		shared static this() {
> 			re = regex(ctKnownRe);
> 		}
> 		Regex!char Re() {
> 			return re;
> 		}
> 	}
A runtime cache should work. By the way, std.regex already caches
regexes (at least those passed as strings to the match* family of
functions).
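
A minimal sketch of leaning on that runtime cache: the pattern is passed to `matchFirst` as a plain string instead of a pre-built `Regex`. The helper name `hasDate` and the date pattern are made up for illustration, and the caching itself happens inside std.regex, so nothing in this snippet controls it.

```d
import std.regex : matchFirst;

// Hypothetical helper: true if `line` contains something shaped like YYYY-MM-DD.
bool hasDate(string line)
{
    // The pattern is a plain string, so std.regex compiles it on demand
    // and is expected to reuse the compiled form on later calls.
    return !line.matchFirst(`\d{4}-\d{2}-\d{2}`).empty;
}

void main()
{
    import std.stdio : writeln;

    foreach (line; ["released 2023-12-18", "no date here"])
        writeln(line, " -> ", hasDate(line));
}
```

The trade-off is that a cache lookup still happens on every call, which is what the template-based approach in the quoted code tries to avoid.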
>
> 	void main() {
> 		string s;
> 		if (s.matchFirst(Re!`some\+pattern`)) {
> 			...
> 		}
>
> 		// This should reuse the Regex instance from before:
> 		if (s.matchFirst(Re!`some\+pattern`)) {
> 			...
> 		}
> 	}
I'm wondering whether it's worth interning patterns like that.
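
For reference, a rough sketch of what such interning could look like with today's std.regex. The name `cachedRe` is hypothetical and not part of any library. It assumes one lazily compiled `Regex` per distinct pattern string, stored in a function-local static (which is thread-local in D), rather than the `shared static this` from the quoted code.

```d
import std.regex : Regex, regex, matchFirst;

// Hypothetical helper: each distinct `pattern` instantiates its own copy of
// this function, and therefore its own thread-local static Regex.
ref Regex!char cachedRe(string pattern)()
{
    static Regex!char re;
    static bool ready;
    if (!ready)
    {
        re = regex(pattern); // compiled once per thread, on first use
        ready = true;
    }
    return re;
}

void main()
{
    string s = "some+pattern here";

    // Both calls go through the same instantiation, so the pattern is
    // compiled only on the first one.
    assert(!s.matchFirst(cachedRe!`some\+pattern`()).empty);
    assert(!s.matchFirst(cachedRe!`some\+pattern`()).empty);

    // Same pattern string, same underlying Regex object.
    assert(&cachedRe!`some\+pattern`() == &cachedRe!`some\+pattern`());
}
```

Compiling lazily on first use keeps startup cost at zero for patterns that are never reached, at the price of one boolean check per call.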
> - Reasonably fast runtime performance. I don't really care if it's the
>   top-of-the-line superfast regex matcher, even though that would be
>   really nice. The primary pain points are the cost of import, and the
>   need to manually write code for automatic caching of fixed runtime
>   regexen.
> - Get rid of ctRegex -- it adds a huge compile-time cost with
>   questionable runtime benefit. Unless there's a way to do this at
>   compile-time that *doesn't* add like 5 seconds per regex to compile
>   times.
Yup, it's dropped, to be eventually replaced by a JIT, which is both
better at compile time and much more flexible at run time.
---
Dmitry Olshansky
CEO @ Glowlabs
https://olshansky.me