Looking for champion - std.lang.d.lex

Bruno Medeiros brunodomedeiros+spam at com.gmail
Fri Nov 19 13:53:12 PST 2010


On 19/11/2010 21:27, Jonathan M Davis wrote:
> On Friday 19 November 2010 13:03:53 Bruno Medeiros wrote:
>> On 22/10/2010 20:48, Andrei Alexandrescu wrote:
>>> On 10/22/10 14:02 CDT, Tomek Sowiński wrote:
>>>> On 22-10-2010 at 00:01:21, Walter Bright <newshound2 at digitalmars.com>
>>>>
>>>> wrote:
>>>>> As we all know, tool support is important for D's success. Making
>>>>> tools easier to build will help with that.
>>>>>
>>>>> To that end, I think we need a lexer for the standard library -
>>>>> std.lang.d.lex. It would be helpful in writing color syntax
>>>>> highlighting filters, pretty printers, repl, doc generators, static
>>>>> analyzers, and even D compilers.
>>>>>
>>>>> It should:
>>>>>
>>>>> 1. support a range interface for its input, and a range interface for
>>>>> its output
>>>>> 2. optionally not generate lexical errors, but just try to recover and
>>>>> continue
>>>>> 3. optionally return comments and ddoc comments as tokens
>>>>> 4. the tokens should be a value type, not a reference type
>>>>> 5. generally follow along with the C++ one so that they can be
>>>>> maintained in tandem
>>>>>
>>>>> It can also serve as the basis for creating a JavaScript
>>>>> implementation that can be embedded into web pages for syntax
>>>>> highlighting, and eventually an std.lang.d.parse.
>>>>>
>>>>> Anyone want to own this?
>>>>
>>>> Interesting idea. Here's another: D will soon need bindings for CORBA,
>>>> Thrift, etc., so lexers will have to be written over and over to grok
>>>> interface files. Perhaps a generic tokenizer that can be parameterized
>>>> with a lexical grammar would bring more ROI. I have a hunch that D's
>>>> templates are strong enough to pull this off without any source code
>>>> generation à la JavaCC. The books I have read on compilers say
>>>> tokenization is a solved problem, so the theory part on what a good
>>>> abstraction should be is done. What do you think?
>>>
>>> Yes. IMHO writing a D tokenizer is a wasted effort. We need a tokenizer
>>> generator.
>>
>> Agreed; of all the things desired for D, a D tokenizer would rank pretty
>> low, I think.
>>
>> Another thing: even though a tokenizer generator would be much more
>> desirable, I wonder if it is wise to have one in the standard library.
>> It does not seem to be of wide enough interest to belong in a standard
>> library. (Out of curiosity, how many languages have such a thing in
>> their standard library?)
>
> We want to make it easy for tools to be built to work on and deal with D code.
> An IDE, for example, needs to be able to tokenize and parse D code. A program
> like lint needs to be able to tokenize and parse D code. By providing a lexer
> and parser in the standard library, we are making it far easier for such tools
> to be written, and they could be of major benefit to the D community. Sure, the
> average program won't need to lex or parse D, but some will, and making that
> easy will encourage such programs to be written.
>
> - Jonathan M Davis
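
As an aside, Walter's points 1 and 4 above (range interfaces for input and
output, value-type tokens) could look roughly like the following in D. This
is only a sketch under assumed names -- Token, TokType, and Lexer are
hypothetical, not an actual Phobos API, and the scanning logic is elided:

```d
// Hypothetical sketch only -- not an actual Phobos API.
import std.range;

enum TokType { identifier, number, eof /* ... */ }

// Point 4: tokens are a plain value type.
struct Token
{
    TokType type;
    string  text;   // slice of the source
    size_t  line;
}

// Point 1: the lexer is itself an input range of Tokens,
// built over any input range of characters.
struct Lexer(R) if (isInputRange!R && is(ElementType!R : dchar))
{
    private R src;
    private Token current;

    this(R src) { this.src = src; popFront(); }

    @property bool empty() const { return current.type == TokType.eof; }
    @property Token front() const { return current; }
    void popFront() { /* scan src, fill `current` */ }
}
```

Because the lexer is itself an input range, it would compose directly with
std.algorithm and the rest of the range machinery, which is presumably the
point of requirement 1.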

And by providing a lexer and a parser outside the standard library, 
wouldn't it make it just as easy for those tools to be written? What's 
the advantage of being in the standard library? I see only 
disadvantages: to begin with it potentially increases the time that 
Walter or other Phobos contributors may have to spend on it, even if 
it's just reviewing patches or making sure the code works.
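
For what it's worth, Tomek's template-parameterized generic tokenizer (and
Andrei's tokenizer generator) could be sketched in toy form like this: the
lexical grammar is passed as a compile-time list of rules, so no
JavaCC-style source generation is needed. All names are made up, and the
rules here match literal lexemes only, purely for brevity; a real design
would take regular expressions or similar patterns:

```d
// Toy sketch of a grammar-parameterized tokenizer; all names hypothetical.
import std.algorithm : startsWith;

struct Rule { string name; string lexeme; } // literal lexemes, for brevity
struct Tok  { string name; string text; }

Tok[] tokenize(Rule[] rules)(string input)
{
    Tok[] result;
    outer: while (input.length)
    {
        foreach (r; rules) // grammar fixed at compile time
        {
            if (input.startsWith(r.lexeme))
            {
                result ~= Tok(r.name, r.lexeme);
                input = input[r.lexeme.length .. $];
                continue outer;
            }
        }
        input = input[1 .. $]; // skip unrecognized character and recover
    }
    return result;
}

// Usage: the "grammar" is just a template argument, e.g. for an IDL file.
enum idlRules = [Rule("kw_interface", "interface"),
                 Rule("lbrace", "{"), Rule("rbrace", "}")];
alias lexIDL = tokenize!idlRules;
```

Since the rule list is a template value parameter, each instantiation bakes
its grammar in at compile time, which is the sort of thing Tomek's "without
any source code generation" remark seems to be after.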


-- 
Bruno Medeiros - Software Engineer


More information about the Digitalmars-d mailing list