std.d.lexer: pre-voting review / discussion

H. S. Teoh hsteoh at quickfur.ath.cx
Thu Sep 12 09:40:28 PDT 2013


On Thu, Sep 12, 2013 at 08:10:18PM +0400, Dmitry Olshansky wrote:
> 12-Sep-2013 19:39, Timon Gehr wrote:
> >On 09/11/2013 08:49 PM, Walter Bright wrote:
> >>4. When naming tokens like .. 'slice', it is giving it a
> >>syntactic/semantic name rather than a token name. This would be
> >>awkward if .. took on new meanings in D. Calling it 'dotdot' would
> >>be clearer.  Ditto for the rest. As an example of this done better,
> >>'*' is called 'star' rather than 'dereference'.
> >
> >FWIW, I use Tok!"..". I.e. a "UDL" for specifying kinds of tokens
> >when interfacing with the parser. Some other kinds of tokens get a
> >canonical representation. Eg. Tok!"i" is the kind of identifier
> >tokens, Tok!"0" is the kind of signed integer literal tokens etc.
> 
> I like this.
> Not only does this have the benefit of not colliding with keywords; I
> also imagine that it could be incredibly convenient to get back the
> symbolic representation of a token (when a token is used as a
> parameter to an AST node, say BinaryExpr!(Tok!"+")). And truth be told,
> we all know how tokens look in symbolic form, so learning a pack of
> names for them feels pointless.

+1.  This is superior to both the ad hoc _ suffix and my ad hoc
prefixing approach.  Tok!"default" is maximally readable, and requires
no silly convolutions like _ or 'kw' / 'tokenType' prefixes.

I vote for Tok!"..." to denote token types.

Question: what's the implementation of Tok? Does it fit into an enum?
What's the underlying representation? I imagine some kind of canonical
mapping into an integral type would be desired, to maximize runtime
performance.
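
One possible answer, purely as a sketch and not the actual std.d.lexer
or Timon's implementation: Tok can be a template that looks its string
argument up in a compile-time table and yields the index as a small
integer. The names tokenStrings, TokenType, tokenIndex and describe are
hypothetical, and the table is truncated to a handful of entries.

```d
// Hypothetical token table; a real lexer would list every D token.
immutable string[] tokenStrings = ["..", "+", "*", "default", "i", "0"];

// Assumed underlying representation: one byte per token kind.
alias TokenType = ubyte;

// Maps a token's spelling to its index in the table.
// Evaluated via CTFE when used from Tok, so an unknown
// spelling is rejected at compile time.
TokenType tokenIndex(string s) pure nothrow @safe
{
    foreach (i, t; tokenStrings)
        if (t == s)
            return cast(TokenType) i;
    assert(0, "unknown token: " ~ s);
}

// Tok!".." is a manifest constant of type TokenType.
template Tok(string s)
{
    enum TokenType Tok = tokenIndex(s);
}

// A token kind used as a template parameter of an AST node;
// the symbolic form is recovered from the same table.
struct BinaryExpr(TokenType op)
{
    enum string symbol = tokenStrings[op];
}

// At runtime a token kind is a plain integer, so it can be
// switched on directly.
string describe(TokenType t)
{
    switch (t)
    {
        case Tok!"..":      return "dotdot";
        case Tok!"default": return "keyword default";
        default:            return tokenStrings[t];
    }
}

static assert(Tok!".." == 0);
static assert(BinaryExpr!(Tok!"+").symbol == "+");
```

With this layout Tok!"default" costs nothing at runtime, and a typo
such as Tok!"defualt" fails during compilation instead of producing a
bogus token kind.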


T

-- 
There are three kinds of people in the world: those who can count, and those who can't.

