DCT: D compiler as a collection of libraries
Jacob Carlborg
doob at me.com
Fri May 11 02:36:26 PDT 2012
On 2012-05-11 11:22, Roman D. Boiko wrote:
>> What about line and column information?
> Indices of the first code unit of each line are stored inside the lexer,
> and a function will compute the Location (line number, column number,
> file specification) for any index. This way the size of a Token instance
> is reduced to a minimum. It is assumed that a Location can be computed on
> demand and is not needed frequently, so the column is calculated by
> walking backwards to the previous end of line, etc. It will be possible
> to calculate Locations either taking special token sequences (e.g.,
> #line 3 "ab/c.d") into account, or discarding them.
Aha, clever. As long as I can get the information out, I'm happy :) How
about adding properties for this in the Token struct?
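The scheme described above could be sketched roughly as follows (a hypothetical illustration in Python rather than D, with made-up names; the real lexer stores line starts while scanning, and a binary search over them replaces a per-token line/column field):

```python
from bisect import bisect_right

def location(line_starts, index):
    """Return (line, column), both 1-based, for a code-unit index.

    line_starts is the sorted list of indices of the first code unit
    of each line, as recorded by the lexer.
    """
    # bisect_right counts how many line starts are <= index,
    # which is exactly the 1-based line number.
    line = bisect_right(line_starts, index)
    column = index - line_starts[line - 1] + 1
    return (line, column)

source = "int x;\nx = 42;\n"
# Line starts: index 0 plus the index after each newline.
line_starts = [0] + [i + 1 for i, c in enumerate(source) if c == "\n"]
print(location(line_starts, source.index("42")))  # token "42" -> (2, 5)
```

A token then only needs to carry its start index; line and column fall out of this lookup whenever someone actually asks for a Location.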
>>>> * Does it convert numerical literals and similar to their actual values
>>> It is planned to add a post-processor for that as part of parser,
>>> please see README.md for some more details.
>>
>> Isn't that a job for the lexer?
> That might be done in the lexer for efficiency reasons (to avoid lexing
> the token value again). But separating this into a dedicated
> post-processing phase leads to a much cleaner design (IMO), and is also
> suitable for uses where such values are not needed.
That might be the case. But I don't think it belongs in the parser.
> Also, I don't think that performance would be improved, given the ratio
> of the number of literals to the total number of tokens and the need to
> store additional information per token if it is done in the lexer. I
> will elaborate on that later.
Ok, fair enough. Perhaps this could be a property in the Token struct as
well. In that case I would suggest renaming "value" to
lexeme/spelling/representation, or something like that, and then naming
the new property "value".
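That naming suggestion could look something like this (a hypothetical sketch in Python, not DCT's actual API: "lexeme" is the raw spelling from the source, and "value" is a property computed on demand, so the lexer itself never pays for literal conversion):

```python
class Token:
    def __init__(self, kind, lexeme):
        self.kind = kind
        self.lexeme = lexeme   # the exact spelling in the source

    @property
    def value(self):
        # Post-processing step, run lazily: convert the lexeme to its
        # actual value only when asked (here just for integer literals).
        if self.kind == "int_literal":
            return int(self.lexeme, 0)  # base 0 handles 0x.../0b.../decimal
        return self.lexeme

tok = Token("int_literal", "0x1A")
print(tok.lexeme, tok.value)  # prints: 0x1A 26
```

Tokens whose values are never requested cost nothing extra, which matches the argument above about the ratio of literals to total tokens.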
--
/Jacob Carlborg
More information about the Digitalmars-d-announce
mailing list