Why Strings as Classes?

Benji Smith dlanguage at benjismith.net
Tue Aug 26 13:32:43 PDT 2008


BCS wrote:
> Reply to Benji,
> 
>> I've used ANTLR a few times. It's nice.
>>
> 
> I've used it. If you gave me the choice of sitting in a small cardboard 
> box all day or using it again, I'll sit in the cardboard box because I 
> fit in that box better.

I've always been impressed by the capabilities of ANTLR. The ANTLRWorks 
IDE is a very cool way to develop and debug grammars, and Terence Parr 
is one of those people who push the research into interesting new 
areas (he wrote something a few months ago about simplifying the 
deeply-recursive Expression grammar common in most languages that I 
found very insightful).

The architecture is pretty cool too. Text input is consumed and ASTs 
are constructed using token grammars, which are then transformed using 
tree-grammars, and code-generation is performed by output grammars. It's 
a very elegant system, and I've seen some example projects that used a 
sequence of those grammars to translate code between different 
programming languages. It's cool stuff.

So I appreciate ANTLR from that perspective. I think the theory behind 
the project is top-notch.

But the syntax sucks. Badly. The learning curve is waaaay too steep for 
me, so I've always had to keep the documentation close by. And once the 
grammars are written, they're hard to read and maintain.

Also, there's a strong bias in the ANTLR community toward ASTs. I prefer 
to construct a somewhat higher-level parse tree. For example: given the 
expression "1 + 2", I'd like the parser to construct a BinaryOperator 
node, with two Expression node children and an enum "operator" field of 
"PLUS". I'd like it to use a set of pre-defined "parse model" classes 
that I've written to represent the language elements.

It's hard to do that kind of thing in ANTLR, which usually just creates 
a "+" node with children of "1" and "2".
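
As a rough illustration of the kind of "parse model" described above, here 
is a minimal Java sketch of what a parser action might build for "1 + 2". 
All of the names here (Operator, Expression, IntegerLiteral, 
BinaryOperator, ParseModelDemo) are hypothetical and made up for this 
example; none of them come from ANTLR or JavaCC.

```java
// Hypothetical parse-model classes; illustrative only, not an ANTLR or JavaCC API.
enum Operator { PLUS, MINUS, TIMES, DIVIDE }

// Common supertype for every expression node in the model.
interface Expression {}

// Leaf node holding an integer literal such as "1" or "2".
final class IntegerLiteral implements Expression {
    final int value;

    IntegerLiteral(int value) {
        this.value = value;
    }
}

// Typed node for a binary operation: an enum operator plus two Expression children.
final class BinaryOperator implements Expression {
    final Operator operator;
    final Expression left;
    final Expression right;

    BinaryOperator(Operator operator, Expression left, Expression right) {
        this.operator = operator;
        this.left = left;
        this.right = right;
    }
}

final class ParseModelDemo {
    // The tree a hand-written grammar action could return for the input "1 + 2".
    static Expression onePlusTwo() {
        return new BinaryOperator(
                Operator.PLUS,
                new IntegerLiteral(1),
                new IntegerLiteral(2));
    }

    public static void main(String[] args) {
        BinaryOperator node = (BinaryOperator) onePlusTwo();
        System.out.println(node.operator);
    }
}
```

Compared with a generic "+" node holding "1" and "2" as text children, the 
operator here is an enum value and the operands are typed Expression 
objects, so later passes can work with fields instead of inspecting token 
text.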

The majority of my parser-generator experience has been with JavaCC, 
which leaves model generation to the user. That works better for me.

--benji


