Do we need faster regex?

H. S. Teoh hsteoh at qfbox.info
Mon Dec 18 17:16:40 UTC 2023


On Sun, Dec 17, 2023 at 03:43:22PM +0000, Dmitry Olshansky via Digitalmars-d wrote:
> So I’ve been working on rewind-regex trying to correct all of the
> decisions in the original engine that slowed it down, dropping some
> features that I knew I cannot implement efficiently (backreferences
> have to go).
> 
> So while I’m obsessed with simplicity and speed I thought I’d ask
> people if it was an issue and what they really want from gen2 regex
> library.
[...]

What I really want:

- Reduce compile-time cost of `import std.regex;` to zero, or at least
  close enough it's no longer noticeable.

- Automatic caching of fixed-string regexes, i.e., the equivalent of:

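	import std.regex;

	// Returns a Regex that is compiled once per distinct pattern
	// string (and per thread), on first use.
	Regex!char Re(string ctKnownRe)() {
		static Regex!char re;
		static bool compiled;
		if (!compiled) {
			re = regex(ctKnownRe);
			compiled = true;
		}
		return re;
	}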

	void main() {
		string s;
		if (s.matchFirst(Re!`some\+pattern`)) {
			...
		}

		// This should reuse the Regex instance from before:
		if (s.matchFirst(Re!`some\+pattern`)) {
			...
		}
	}

- Reasonably fast runtime performance. I don't really care if it's the
  top-of-the-line superfast regex matcher, even though that would be
  really nice.  The primary pain points are the cost of import, and the
  need to manually write code for automatic caching of fixed runtime
  regexen.

- Get rid of ctRegex -- it adds a huge compile-time cost with
  questionable runtime benefit. Unless there's a way to do this at
  compile-time that *doesn't* add like 5 seconds per regex to compile
  times.
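
For reference, here is a minimal sketch of the two ways to build the same
matcher with today's std.regex. The pattern and the helper names
(matchesCt, matchesRt) are made up for illustration, and the code says
nothing about timings; it only shows where the compile-time work comes
from.

```d
import std.regex;

// Hypothetical pattern, used only for illustration.
enum pattern = `(\d+)-(\d+)`;

// ctRegex: the matcher is generated while the program is being
// compiled, which is where the extra build time is spent.
bool matchesCt(string s) {
	auto re = ctRegex!pattern;
	return !s.matchFirst(re).empty;
}

// regex(): the pattern is compiled at run time instead, so it adds
// no per-pattern cost to the build.
bool matchesRt(string s) {
	auto re = regex(pattern);
	return !s.matchFirst(re).empty;
}
```

Both helpers accept and reject the same inputs; the difference is only in
when the pattern gets compiled.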


T

-- 
That's not a bug; that's a feature!

