Writing a really fast lexer

vnr cfcr at gmail.com
Fri Dec 11 19:49:12 UTC 2020


For a project where performance matters, I need to be able to
analyse text. To do that, I plan to write a parser by hand using
recursive descent, driven by a stream of tokens. I started by
writing a lexer with the d-lex package
(https://code.dlang.org/packages/d-lex). It works really well,
but unfortunately it is quite slow for the number of lines I am
aiming to analyse (in a test on a million lines, it took about
3 minutes).

Since the parser will only ever manipulate tokens, I think the
lexer's performance is the more important thing to get right. So
I am wondering what resources exist in D for writing an efficient
lexer. I could of course write one by hand, but I would like to
know what already exists, so as not to reinvent the wheel. And of
course, if the standard library (or derivatives such as mir) has
features that could help with such a lexer, I am interested in
those too. Thanks! :)
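
To give an idea of what I mean by "by hand", below is roughly the
kind of lexer loop I would otherwise write myself. It is only a
sketch: the token kinds and the tiny input are invented for the
example, and it simply slices the input string instead of copying,
which is the sort of thing I would hope an existing library does too.

// Rough sketch of a hand-written lexer; token kinds and the
// input "grammar" are made up purely for illustration.
import std.ascii : isAlpha, isAlphaNum, isDigit, isWhite;
import std.stdio : writeln;

enum TokKind { identifier, number, symbol, eof }

struct Token
{
    TokKind kind;
    string text;   // slice of the original input, no copying
}

struct Lexer
{
    string input;
    size_t pos;

    Token next()
    {
        // skip whitespace
        while (pos < input.length && isWhite(input[pos]))
            ++pos;

        if (pos >= input.length)
            return Token(TokKind.eof, "");

        immutable start = pos;
        immutable c = input[pos];

        // identifier: letter or '_' followed by letters, digits, '_'
        if (isAlpha(c) || c == '_')
        {
            while (pos < input.length
                   && (isAlphaNum(input[pos]) || input[pos] == '_'))
                ++pos;
            return Token(TokKind.identifier, input[start .. pos]);
        }

        // number: run of decimal digits
        if (isDigit(c))
        {
            while (pos < input.length && isDigit(input[pos]))
                ++pos;
            return Token(TokKind.number, input[start .. pos]);
        }

        // anything else: single-character symbol
        ++pos;
        return Token(TokKind.symbol, input[start .. pos]);
    }
}

void main()
{
    auto lexer = Lexer("foo = bar + 42");
    for (auto tok = lexer.next(); tok.kind != TokKind.eof; tok = lexer.next())
        writeln(tok.kind, " '", tok.text, "'");
}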
