Lexer and parser generators using CTFE

Wed Feb 29 15:23:22 PST 2012

On Thu, Mar 01, 2012 at 12:04:39AM +0100, Martin Nowak wrote:
[...]
> Mmh, I've retested and you're right dmd's lexer is about 2x faster.
> The main overhead stems from using ranges and enforce.
> 
> Quick profiling shows that 25% is spent in popFront and
> std.utf.stride.  Last time I worked on this I rewrote std.utf.decode
> to be much faster.  But utf characters are still "decoded" twice, once
> for front and then again for popFront. Also stride uses table lookup
> and can't be inlined.
[...]

One way to avoid decoding each character twice is to keep a single-dchar
readahead buffer in your range: popFront does the decoding once, and
front just returns the cached value. Something like:

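	import std.stdio : File;
	import std.utf : decode;

	struct MyRange {
		private File src;
		private char[] buf;		// undecoded bytes read from src
		private dchar readahead;	// most recently decoded character

		this(File _src) {
			src = _src;
			// ... fill up buf from src here
			popFront();	// prime readahead with the first character
		}

		// front only returns the cached character; nothing is decoded here
		@property dchar front() const pure {
			return readahead;
		}

		void popFront() {
			// ... handle end of input (empty buf) here
			size_t stride = 0;
			// decode advances stride past the character it just decoded
			readahead = decode(buf, stride);
			buf = buf[stride..$];
			// ... fill up buf more if needed
		}
	}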
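
For something you can compile and try right away, here is a rough sketch
of the same idea over an in-memory string instead of a File, so the
buffer-refill parts drop out. The names CachedDecoder and decodeAll are
made up for illustration, and the `done` flag is just one assumed way of
signalling end of input; it is not the only way to do it.

```d
import std.utf : decode;

// Hypothetical string-backed variant of the readahead idea.
struct CachedDecoder {
	private const(char)[] buf;	// undecoded remainder of the input
	private dchar readahead;	// cached result returned by front
	private bool done;		// set once we step past the last character

	this(const(char)[] s) {
		buf = s;
		popFront();	// prime the cache
	}

	@property bool empty() const { return done; }

	// No decoding here: just hand back the cached character.
	@property dchar front() const { return readahead; }

	void popFront() {
		if (buf.length == 0) {
			done = true;
			return;
		}
		size_t i = 0;
		readahead = decode(buf, i);	// decode advances i past the character
		buf = buf[i .. $];
	}
}

// Hypothetical helper: walk the range and collect the decoded characters.
dstring decodeAll(const(char)[] s) {
	dstring result;
	for (auto r = CachedDecoder(s); !r.empty; r.popFront())
		result ~= r.front;
	return result;
}
```

Each character goes through decode exactly once, in popFront; calling
front repeatedly costs nothing more than a field read. The price is one
extra dchar (plus the flag) of state per range.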

T

-- 
"A man's wife has more power over him than the state has." -- Ralph Emerson