DCT: D compiler as a collection of libraries
Roman D. Boiko
rb at d-coding.com
Fri May 11 22:13:31 PDT 2012
On Saturday, 12 May 2012 at 03:32:20 UTC, Ary Manzana wrote:
> As deadalnix says, I think you are over-complicating things.
>
> I mean, to store the column and line information it's just:
>
> if (isNewLine(c)) {
> line++;
> column = 0;
> } else {
> column++;
> }
>
> (I think you need to add that to the SourceRange class. Then
> copy line and column to token on the Lexer#lex() method)
>
> Do you really think it's that costly in terms of performance?
>
> I think you are wasting much more memory and performance by
> storing all the tokens in the lexer.
>
> Imagine I want to implement a simple syntax highlighter: just
> highlight keywords. How can I tell DCT to *not* store all
> tokens because I need each one in turn? And since I'll be
> highlighting in the editor I will need column and line
> information. That means I'll have to do that O(log(n))
> operation for every token.
>
> So you see, for the simplest use case of a lexer the
> performance of DCT is awful.
>
> Now imagine I want to build an AST. Again, I consume the tokens
> one by one, probably peeking in some cases. If I want to store
> line and column information I just copy them to the AST. You
> say the tokens are discarded but their data is not, and that's
> why their data is usually copied.
Summary of your points (I deliberately emphasize some of them
more than you did; please correct me if I misinterpreted
anything):
* storing location for each token is simple and cheap
* SourceRange is needed; an instance per token; it should be a
class, not struct
* Syntax highlighting doesn't need to keep all tokens in memory
* but it must know column and line for each token
* for this use case DCT has awful performance
* to build an AST, lines and columns are required
* information from tokens must be copied and possibly transformed
in order to put it into AST; after such transformation tokens are
not needed any more
Those are all valid points, given that we don't have any
benchmarks and usage examples. The good news is that use cases
like highlighting, parsing, autocompletion, etc. are the core
functionality for which DCT should be designed. So if it fails
any of them, the design needs to be revisited.
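To make the trade-off under discussion concrete, here is a minimal sketch of the kind of lookup the quoted message calls "that O(log(n)) operation": each token keeps only its offset, and line and column are computed on demand by binary search over a table of line-start offsets. This is an illustration, not DCT's actual code. The names Position, buildLineStarts and locate are hypothetical, only '\n' is treated as a line break, and columns count UTF-8 code units rather than characters. Both values are zero-based, matching the `column = 0` reset in the quoted snippet.

```d
/// Zero-based line and column of a source position (hypothetical type).
struct Position
{
    size_t line;
    size_t column;
}

/// One pass over the source: records the offset at which every line starts.
/// Assumption: only '\n' terminates a line.
size_t[] buildLineStarts(string source)
{
    size_t[] starts = [0];          // the first line starts at offset 0
    foreach (i, char c; source)     // iterates UTF-8 code units, not characters
        if (c == '\n')
            starts ~= i + 1;        // the next line starts right after '\n'
    return starts;
}

/// Binary search for the last line that starts at or before `offset`.
/// Cost is O(log n) in the number of lines.
Position locate(const size_t[] lineStarts, size_t offset)
{
    size_t lo = 0;
    size_t hi = lineStarts.length;  // exclusive upper bound
    while (hi - lo > 1)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (lineStarts[mid] <= offset)
            lo = mid;               // line `mid` starts at or before the offset
        else
            hi = mid;               // line `mid` starts after the offset
    }
    return Position(lo, offset - lineStarts[lo]);
}

unittest
{
    auto starts = buildLineStarts("ab\ncd\n\nxyz");
    assert(starts == [0, 3, 6, 7]);
    assert(locate(starts, 4) == Position(1, 1));   // the 'd' in "cd"
}
```

Compared with the per-character counter in the quoted snippet, this stores one offset per token plus one table entry per line, and pays for a search each time a location is requested. Which of the two is cheaper for highlighting or AST construction is exactly what benchmarks would have to show.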
But I will need some time to thoroughly address each of these
issues (either prove that it is not relevant, or fix the
problem). I will definitely complete this work during May. I will
regularly report my progress here.
In general, I tend to disagree with most of these points, because I
already thought about each and designed accordingly, but it is
important to give them enough attention. What I am failing miserably at
right now is providing actual usage code and benchmarks, so I'm
going to focus on that.
Thanks for your feedback!