Request for comments: std.d.lexer
Brian Schott
briancschott at gmail.com
Sun Jan 27 02:42:17 PST 2013
On Sunday, 27 January 2013 at 10:17:48 UTC, Philippe Sigaud wrote:
> * Having a range interface is good. Any reason why you made byToken a
> class and not a struct? Most (like, 99%) of ranges in Phobos are
> structs. Do you need reference semantics?
It implements the InputRange interface from std.range so that
users have a choice of using template constraints or the OO model
in their code.
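
As a rough illustration of the two styles (not the actual std.d.lexer
code), the sketch below uses a hypothetical Tok struct and two
hypothetical counting functions. One accepts any range through a
template constraint; the other accepts the InputRange!Tok interface.
std.range.inputRangeObject wraps an ordinary range in a class that
implements the interface.

```d
import std.range : ElementType, InputRange, inputRangeObject, isInputRange;

// Hypothetical stand-in for the lexer's token type.
struct Tok
{
    string value;
}

// Template-constraint style: works with any input range of Tok.
size_t countTokens(R)(R tokens)
    if (isInputRange!R && is(ElementType!R == Tok))
{
    size_t n;
    for (; !tokens.empty; tokens.popFront())
        ++n;
    return n;
}

// OO style: works with any object implementing InputRange!Tok.
size_t countTokensOO(InputRange!Tok tokens)
{
    size_t n;
    for (; !tokens.empty; tokens.popFront())
        ++n;
    return n;
}

void main()
{
    auto toks = [Tok("int"), Tok("x"), Tok(";")];

    // A plain array satisfies the template constraint directly.
    assert(countTokens(toks) == 3);

    // Wrapping it in a class object gives the interface-based view.
    InputRange!Tok obj = inputRangeObject(toks);
    assert(countTokensOO(obj) == 3);
}
```

A class that implements the interface can be passed to either function,
which is the flexibility described above.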
> * Also, is there a way to keep comments? Any code wanting to modify
> the code might need them.
> (edit: Ah, I see it: IterationStyle.IncludeComments)
>
> * I'd distinguish between standard comments and documentation
> comments. These are different beasts, to my eyes.
The standard at http://dlang.org/lex.html doesn't differentiate
between them. It's trivial to write a function that checks if a
token starts with "///", "/**", or "/++" while iterating over the
tokens.
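
A minimal sketch of such a check, assuming the comment text is
available as a string (isDocComment is a hypothetical name, not part
of the module). Note that a plain prefix test also matches the empty
comment "/**/", so a stricter check may be wanted in practice.

```d
import std.algorithm : startsWith;

// Hypothetical helper: true if the comment text opens with one of the
// three documentation-comment markers.
bool isDocComment(string text)
{
    // With several needles, startsWith returns the 1-based index of the
    // needle that matched, or 0 if none did.
    return text.startsWith("///", "/**", "/++") != 0;
}

void main()
{
    assert(isDocComment("/// Summary line"));
    assert(isDocComment("/** Block doc */"));
    assert(isDocComment("/++ Nesting doc +/"));
    assert(!isDocComment("// ordinary comment"));
    assert(!isDocComment("/* ordinary block */"));
}
```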
> * I see Token has a startIndex member. Any reason not to have an
> endIndex member? Or can an end index always be deduced from
> startIndex and value.length?
That's the idea.
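
For example, assuming a simplified Token with only the two fields
mentioned (the real struct has more), and assuming startIndex counts
code units in the source, a hypothetical endIndex helper could be:

```d
// Simplified, hypothetical version of the token struct.
struct Token
{
    string value;      // text of the token
    size_t startIndex; // offset of the token's first code unit in the source
}

// One-past-the-end offset, derived rather than stored.
size_t endIndex(Token t)
{
    return t.startIndex + t.value.length;
}

void main()
{
    string source = "int x;";
    auto t = Token("x", 4);

    assert(endIndex(t) == 5);
    // The derived range slices the token's text back out of the source.
    assert(source[t.startIndex .. endIndex(t)] == t.value);
}
```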
> * How does it fare with non ASCII code?
Everything is templated on the character type, but I haven't done
any testing on UTF-16 or UTF-32. Valgrind still shows functions
from std.uni being called, so at the moment I assume it works.
> * A rough estimate of number of tokens/s would be good (I know it'll
> vary). Walter seems to think if a lexer is not able to vomit thousands
> of tokens a second, then it's not good. On a related note, does your
> lexer have any problem with 10k+-line files?
$ time dscanner --sloc ../phobos/std/datetime.d
14950
real 0m0.319s
user 0m0.313s
sys 0m0.006s
$ time dmd -c ../phobos/std/datetime.d
real 0m0.354s
user 0m0.318s
sys 0m0.036s
Yes, I know that "time" is a terrible benchmarking tool, but
they're fairly close for whatever that's worth.