Tokenizing D at compile time?

Rainer Schuetze r.sagitario at gmx.de
Fri Aug 26 00:46:01 PDT 2011



On 26.08.2011 03:08, dsimcha wrote:
> I'm working on a parallel array ops implementation for
> std.parallel_algorithm. (For the latest work in progress see
> https://github.com/dsimcha/parallel_algorithm/blob/master/parallel_algorithm.d
> ).
>
> To make it (somewhat) pretty, I need to be able to tokenize a single
> statement worth of D source code at compile time. Right now, the syntax
> requires manual tokenization:
>
> mixin(parallelArrayOp(
> "lhs[]", "=", "op1[]", "*", "op2[]", "/", "op3[]"
> ));
>
> where lhs, op1, op2, op3 are arrays.
>
> I'd like it to be something like:
>
> mixin(parallelArrayOp(
> "lhs[] = op1[] * op2[] / op3[]"
> ));
>
> Does anyone have/is there any easy way to write a compile-time D tokenizer?

The lexer used by Visual D is also CTFE capable:

http://www.dsource.org/projects/visuald/browser/trunk/vdc/lexer.d

As Timon pointed out, it will split the input into individual D tokens, not 
the larger combined elements in your array.
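
For illustration, a small post-pass could glue an identifier back together 
with a following "[" "]" to recover elements like "op1[]". The helper below 
is only a sketch working on plain token strings; the name mergeSlices and 
the whole function are hypothetical, not part of the Visual D lexer:

```d
// Hypothetical helper (sketch): merge a token followed by "[" "]"
// into a single "name[]" element.
string[] mergeSlices(string[] toks)
{
	string[] result;
	for (size_t i = 0; i < toks.length; )
	{
		if (i + 2 < toks.length && toks[i + 1] == "[" && toks[i + 2] == "]")
		{
			result ~= toks[i] ~ "[]"; // e.g. "op1", "[", "]" -> "op1[]"
			i += 3;
		}
		else
		{
			result ~= toks[i];
			i += 1;
		}
	}
	return result;
}

// Works at compile time too, since it is plain CTFE-able code:
static assert(mergeSlices(["lhs", "[", "]", "=", "op1", "[", "]"])
	== ["lhs[]", "=", "op1[]"]);
```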

Here's my small CTFE test:

///////////////////////////////////////////////////////////////////////
int[] ctfeLexer(string s)
{
	Lexer lex;
	int state;   // lexer state carried across calls to scan()
	uint pos;    // current position in the input

	int[] ids;
	while(pos < s.length)
	{
		uint prevpos = pos;
		int id;
		// scan() consumes one token and advances pos past it
		int type = lex.scan(state, s, pos, id);
		assert(prevpos < pos); // scan() must always make progress
		// skip whitespace and comments, keep everything else
		if(!Lexer.isCommentOrSpace(type, s[prevpos .. pos]))
			ids ~= id;
	}
	return ids;
}

unittest
{
	static assert(ctfeLexer(q{int /* comment to skip */ a;}) ==
		[ TOK_int, TOK_Identifier, TOK_semicolon ]);
}

If you want the tokens as strings rather than just the token IDs, you can 
collect "s[prevpos .. pos]" instead of "id" into an array.
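
That variant might look like this (same assumptions as above: the Lexer 
from vdc/lexer.d; the name ctfeLexerStrings is mine):

```d
// Variant of ctfeLexer that keeps the token text instead of the token ID.
string[] ctfeLexerStrings(string s)
{
	Lexer lex;
	int state;
	uint pos;

	string[] toks;
	while(pos < s.length)
	{
		uint prevpos = pos;
		int id;
		int type = lex.scan(state, s, pos, id);
		if(!Lexer.isCommentOrSpace(type, s[prevpos .. pos]))
			toks ~= s[prevpos .. pos]; // slice of the original input
	}
	return toks;
}
```

Since the slices refer back into the input string, no copying happens; the 
result is just an array of views into s.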
