Looking for champion - std.lang.d.lex

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Fri Nov 19 15:39:53 PST 2010


On 11/19/10 1:03 PM, Bruno Medeiros wrote:
> On 22/10/2010 20:48, Andrei Alexandrescu wrote:
>> On 10/22/10 14:02 CDT, Tomek Sowiński wrote:
>>> On 22-10-2010 at 00:01:21, Walter Bright <newshound2 at digitalmars.com>
>>> wrote:
>>>
>>>> As we all know, tool support is important for D's success. Making
>>>> tools easier to build will help with that.
>>>>
>>>> To that end, I think we need a lexer for the standard library -
>>>> std.lang.d.lex. It would be helpful in writing color syntax
>>>> highlighting filters, pretty printers, REPLs, doc generators, static
>>>> analyzers, and even D compilers.
>>>>
>>>> It should:
>>>>
>>>> 1. support a range interface for its input, and a range interface for
>>>> its output
>>>> 2. optionally not generate lexical errors, but just try to recover and
>>>> continue
>>>> 3. optionally return comments and ddoc comments as tokens
>>>> 4. the tokens should be a value type, not a reference type
>>>> 5. generally follow along with the C++ one so that they can be
>>>> maintained in tandem
>>>>
>>>> It can also serve as the basis for creating a JavaScript
>>>> implementation that can be embedded into web pages for syntax
>>>> highlighting, and eventually an std.lang.d.parse.
>>>>
>>>> Anyone want to own this?
>>>
>>> Interesting idea. Here's another: D will soon need bindings for CORBA,
>>> Thrift, etc., so lexers will have to be written all over again to grok
>>> interface files. Perhaps a generic tokenizer that can be parametrized
>>> with a lexical grammar would bring more ROI. I have a hunch D's templates
>>> are strong enough to pull this off without any source code generation
>>> à la JavaCC. The books I've read on compilers say tokenization is a solved
>>> problem, so the theory part on what a good abstraction should be is
>>> done. What do you think?
>>
>> Yes. IMHO writing a D tokenizer is a wasted effort. We need a tokenizer
>> generator.
>>
>
> Agreed. Of all the things desired for D, a D tokenizer would rank pretty
> low, I think.
>
> Another thing: even though a tokenizer generator would be much more
> desirable, I wonder whether it is wise to have that in the standard library.
> It does not seem to be of wide enough interest to be in a standard
> library. (Out of curiosity, how many languages have such a thing in
> their standard library?)

Even C has strtok.
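
For readers who have not used it, here is a minimal sketch of what strtok
offers. The helper name split_tokens and its signature are made up for
illustration; only strtok itself is part of the C standard library. Note
that strtok merely splits on delimiter characters: it does not classify
tokens, and it would not separate something like "x=42" unless '=' were
listed as a delimiter (in which case the '=' would be discarded).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a standard function): splits `buf` in place on
   any character in `delims` and stores up to `max` token pointers in `out`.
   Returns the number of tokens stored.
   strtok overwrites delimiters in `buf` with '\0' and keeps hidden static
   state between calls, so it is neither reentrant nor thread-safe. */
size_t split_tokens(char *buf, const char *delims, char **out, size_t max)
{
    size_t n = 0;
    for (char *tok = strtok(buf, delims);
         tok != NULL && n < max;
         tok = strtok(NULL, delims)) {
        out[n++] = tok;
    }
    return n;
}
```

Called on a writable buffer holding "int x = 42;" with the delimiters " ;",
this should yield the four pieces "int", "x", "=" and "42".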

Andrei

