Should a parser type be a struct or class?

Stanislav Blinov stanislav.blinov at gmail.com
Wed Jun 17 12:35:09 UTC 2020


On Wednesday, 17 June 2020 at 11:50:27 UTC, Per Nordlöw wrote:
> Should a range-compliant aggregate type realizing a parser be 
> encoded as a struct or class? In dmd `Lexer` and `Parser` are 
> both classes.
>
> In general how should I reason about whether an aggregate type 
> should be encoded as a struct or class?

What's a range-compliant aggregate type? Ranges are typically 
views of someone else's data; an owner of the data wouldn't store 
mutable iterators, and so wouldn't be a range. For that reason, 
too, ranges are structs: most of them are thin wrappers over a 
set of iterators, with an interface to mutate them.

If you *really* need runtime polymorphism as provided by the 
language - use a class. Otherwise - use a struct. It's pretty 
straightforward. Even then, in some cases one can realize their 
own runtime polymorphism without classes (look at e.g. Atila 
Neves' 'tardy' library).
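
To make the distinction concrete, here is a minimal sketch. All 
names in it (`Shape`, `Square`, `Rect`, `AnyShape`, `totalArea`) 
are made up for illustration, and the type-erasing struct is just 
one simple way to hand-roll dynamic dispatch with a delegate; it 
is not how 'tardy' is implemented.

```d
// Option 1: language-provided runtime polymorphism needs a class
// (here behind an interface), dispatched through virtual calls.
interface Shape
{
    double area() const;
}

final class Square : Shape
{
    private double side;
    this(double side) { this.side = side; }
    double area() const { return side * side; }
}

// Option 2: no dynamic dispatch needed, so a plain struct will do.
struct Rect
{
    double w, h;
    double area() const { return w * h; }
}

// Option 3 (hypothetical): hand-rolled type erasure in a struct.
// Any type with an `area()` method can be wrapped, no hierarchy needed.
struct AnyShape
{
    private double delegate() areaImpl;

    this(T)(T shape)
    {
        // The delegate captures a copy of `shape`; because it escapes
        // the constructor, D allocates the closure on the GC heap.
        areaImpl = () => shape.area();
    }

    double area() { return areaImpl(); }
}

// Sums areas of differently-typed shapes stored in one array.
double totalArea(AnyShape[] shapes)
{
    double total = 0;
    foreach (s; shapes)
        total += s.area();
    return total;
}

// Run with: dmd -unittest -main -run thisfile.d
unittest
{
    Shape viaClass = new Square(3.0);
    assert(viaClass.area() == 9.0);

    auto mixed = [AnyShape(Rect(2.0, 4.0)), AnyShape(new Square(3.0))];
    assert(totalArea(mixed) == 17.0);
}
```

The struct version trades the class hierarchy for a closure 
allocation per wrapped value; whether that is a better deal 
depends on the use case.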

It's very easy to implement a lexer as an input range: it'd just 
be a pointer into a buffer plus some additional iteration data 
(like line/column position, for example). I.e. a struct. Making 
it a struct also lets you make it a forward range instead of an 
input range, which is useful if you need lookahead:

struct TokenStream
{
     this(SourceBuffer source)
     {
         this.cursor = source.text.ptr;
         advance(this);
     }

     bool empty() const
     {
         return token.type == TokenType.eof;
     }

     ref front() return scope const
     {
         return token;
     }

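     void popFront()
     {
         switch (token.type)
         {
             default:
                 // Ordinary token: lex the next one.
                 advance(this);
                 break;
             case TokenType.eof:
                 // Already at the end: nothing left to pop.
                 break;
             case TokenType.error:
                 // Stop after an error: turn the token into eof with
                 // an empty span located at the end of the bad token.
                 token.type = TokenType.eof;
                 token.lexSpan = LexicalSpan(token.lexSpan.end,
                                             token.lexSpan.end);
                 break;
         }
     }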

     TokenStream save() const
     {
         return this;
     }

private:

     const(char)* cursor;
     Location location;
     Token token;
}

, where `advance` is implemented as a module-private function 
that actually parses the source into the next token.
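
The lookahead point can be sketched with a self-contained toy. 
`WordStream` and `peek` below are hypothetical names, and the 
"lexer" just splits on spaces rather than producing real tokens; 
the part that matters is that `save` yields an independent copy, 
so peeking ahead doesn't consume anything from the original.

```d
import std.range.primitives : isForwardRange;

// Hypothetical toy lexer: yields space-separated words of a string.
struct WordStream
{
    this(string source)
    {
        rest = source;
        popFront(); // prime the first word
    }

    bool empty() const { return done; }

    string front() const { return word; }

    void popFront()
    {
        size_t i = 0;
        while (i < rest.length && rest[i] == ' ')
            ++i; // skip blanks
        if (i == rest.length)
        {
            // No more words: mark the range as exhausted.
            done = true;
            word = null;
            rest = null;
            return;
        }
        size_t j = i;
        while (j < rest.length && rest[j] != ' ')
            ++j; // find the end of the word
        word = rest[i .. j];
        rest = rest[j .. $];
    }

    // A struct copy is an independent iteration state.
    WordStream save() const { return this; }

private:
    string rest; // unread input
    string word; // current token
    bool done;
}

// Returns the element `n` positions ahead of the front without
// consuming anything from `r` (n == 0 is the front itself).
auto peek(R)(R r, size_t n)
if (isForwardRange!R)
{
    auto copy = r.save;
    foreach (_; 0 .. n)
        copy.popFront();
    assert(!copy.empty, "peeked past the end of the range");
    return copy.front;
}

// Run with: dmd -unittest -main -run thisfile.d
unittest
{
    static assert(isForwardRange!WordStream);

    auto tokens = WordStream("let x = 42");
    assert(peek(tokens, 2) == "=");
    assert(tokens.front == "let"); // the original did not advance
}
```

With a class you'd have to write `save` as an explicit 
allocation and copy; with a struct it is a plain value copy.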

DMD's Lexer/Parser aren't ranges. They're ourobori.

