Looking for champion - std.lang.d.lex

Bruno Medeiros brunodomedeiros+spam at com.gmail
Fri Nov 19 13:03:53 PST 2010


On 22/10/2010 20:48, Andrei Alexandrescu wrote:
> On 10/22/10 14:02 CDT, Tomek Sowiński wrote:
>> On 22-10-2010 at 00:01:21, Walter Bright <newshound2 at digitalmars.com>
>> wrote:
>>
>>> As we all know, tool support is important for D's success. Making
>>> tools easier to build will help with that.
>>>
>>> To that end, I think we need a lexer for the standard library -
>>> std.lang.d.lex. It would be helpful in writing color syntax
>>> highlighting filters, pretty printers, repl, doc generators, static
>>> analyzers, and even D compilers.
>>>
>>> It should:
>>>
>>> 1. support a range interface for its input, and a range interface for
>>> its output
>>> 2. optionally not generate lexical errors, but just try to recover and
>>> continue
>>> 3. optionally return comments and ddoc comments as tokens
>>> 4. the tokens should be a value type, not a reference type
>>> 5. generally follow along with the C++ one so that they can be
>>> maintained in tandem
>>>
>>> It can also serve as the basis for creating a javascript
>>> implementation that can be embedded into web pages for syntax
>>> highlighting, and eventually an std.lang.d.parse.
>>>
>>> Anyone want to own this?
>>
>> Interesting idea. Here's another: D will soon need bindings for CORBA,
>> Thrift, etc, so lexers will have to be written all over to grok
>> interface files. Perhaps a generic tokenizer which can be parametrized
>> with a lexical grammar would bring more ROI. I have a hunch D's templates
>> are strong enough to pull this off without any source code generation
>> à la JavaCC. The books I read on compilers say tokenization is a solved
>> problem, so the theory part on what a good abstraction should be is
>> done. What do you think?
>
> Yes. IMHO writing a D tokenizer is a wasted effort. We need a tokenizer
> generator.
>

Agreed. Of all the things desired for D, a D tokenizer would rank pretty
low, I think.

Another thing: even though a tokenizer generator would be much more
desirable, I wonder whether it is wise to put it in the standard library.
It does not seem to be of wide enough interest to belong there. (Out of
curiosity, how many languages have such a thing in their standard
library?)
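
To make the "tokenizer parametrized with a lexical grammar" idea concrete,
here is a minimal sketch of what such a thing could look like. It is in
Python purely for brevity, and it is not anything proposed in this thread:
the names make_tokenizer, Token, TOY_RULES and the toy grammar itself are
all hypothetical. It loosely mirrors a few of Walter's requirements (lazy
output, tokens as values, optional comment tokens, recovery from bad input
instead of an error).

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    """A token as a plain value: its kind, its text, and its source offset."""
    kind: str
    text: str
    offset: int


def make_tokenizer(rules, skip=()):
    """Build a tokenizer from (name, regex) pairs. Hypothetical helper.

    Rules are tried in order, so earlier rules win over later ones.
    Token kinds listed in `skip` are matched but not emitted.
    """
    pattern = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in rules))
    skipped = frozenset(skip)

    def tokenize(source):
        pos = 0
        while pos < len(source):
            m = pattern.match(source, pos)
            if m is None or m.end() == pos:
                # No rule applies: recover by emitting a one-character
                # ERROR token and carrying on, instead of raising.
                yield Token("ERROR", source[pos], pos)
                pos += 1
                continue
            if m.lastgroup not in skipped:
                yield Token(m.lastgroup, m.group(), pos)
            pos = m.end()

    return tokenize


# A toy lexical grammar (an assumption for illustration, not D's grammar).
TOY_RULES = [
    ("COMMENT", r"//[^\n]*"),
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=;]"),
    ("WS", r"\s+"),
]

# Drop whitespace but keep comments as tokens.
lex = make_tokenizer(TOY_RULES, skip={"WS"})

for token in lex("x = 42; // hi\n$"):
    print(token)
```

Adding "COMMENT" to the skip set would drop comment tokens as well, which
is roughly the "optionally return comments" switch from the original list.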


-- 
Bruno Medeiros - Software Engineer

