Looking for champion - std.lang.d.lex

Jonathan M Davis jmdavisProg at gmx.com
Fri Nov 19 13:27:43 PST 2010


On Friday 19 November 2010 13:03:53 Bruno Medeiros wrote:
> On 22/10/2010 20:48, Andrei Alexandrescu wrote:
> > On 10/22/10 14:02 CDT, Tomek Sowiński wrote:
> >> On 22-10-2010 at 00:01:21, Walter Bright <newshound2 at digitalmars.com> wrote:
> >>> As we all know, tool support is important for D's success. Making
> >>> tools easier to build will help with that.
> >>> 
> >>> To that end, I think we need a lexer for the standard library -
> >>> std.lang.d.lex. It would be helpful in writing color syntax
> >>> highlighting filters, pretty printers, repl, doc generators, static
> >>> analyzers, and even D compilers.
> >>> 
> >>> It should:
> >>> 
> >>> 1. support a range interface for its input, and a range interface for
> >>> its output
> >>> 2. optionally not generate lexical errors, but just try to recover and
> >>> continue
> >>> 3. optionally return comments and ddoc comments as tokens
> >>> 4. the tokens should be a value type, not a reference type
> >>> 5. generally follow along with the C++ one so that they can be
> >>> maintained in tandem
> >>> 
> >>> It can also serve as the basis for creating a javascript
> >>> implementation that can be embedded into web pages for syntax
> >>> highlighting, and eventually an std.lang.d.parse.
> >>> 
> >>> Anyone want to own this?
> >> 
> >> Interesting idea. Here's another: D will soon need bindings for CORBA,
> >> Thrift, etc, so lexers will have to be written all over to grok
> >> interface files. Perhaps a generic tokenizer which can be parametrized
> >> with a lexical grammar would bring more ROI. I have a hunch D's templates
> >> are strong enough to pull this off without any source code generation
> >> à la JavaCC. The books I read on compilers say tokenization is a solved
> >> problem, so the theory part on what a good abstraction should be is
> >> done. What do you think?
> > 
> > Yes. IMHO writing a D tokenizer is a wasted effort. We need a tokenizer
> > generator.
> 
> Agreed, of all the things desired for D, a D tokenizer would rank pretty
> low I think.
> 
> Another thing, even though a tokenizer generator would be much more
> desirable, I wonder if it is wise to have that in the standard library?
> It does not seem to be of wide enough interest to be in a standard
> library. (Out of curiosity, how many languages have such a thing in
> their standard library?)

We want to make it easy to build tools that work on D code. An IDE, for
example, needs to be able to tokenize and parse D code, and so does a program
like lint. By providing a lexer and parser in the standard library, we make
such tools far easier to write, and they could be of major benefit to the D
community. Sure, the average program won't need to lex or parse D, but some
will, and the standard library is the place to make that easy.

- Jonathan M Davis

