A brainfuck parser works now with newCTFE

Sat Jul 29 18:55:54 PDT 2017

On Sunday, 30 July 2017 at 01:15:50 UTC, Stefan Koch wrote:
> [ ... ]

Shortly after posting this I came up with a more effective 
variant of the parseBf which reduces the generate bytecode from 
around 650 instructions to around 430 instructions.

const(RepeatedToken[]) parseBf(const string input) pure
{
     uint pos;
     RepeatedToken[] result;
     result.length = input.length + 2;
     // the maximal number of diffrent tokens is equal to the 
chars in the input
     // plus the begin and end token

     result[0] = Token(BFTokenEnum.ProgrammBegin);
     uint resultLen = 0;

     while (pos < input.length)
     {
         uint lastToken = (result[resultLen] >> 24);
         uint thisToken = BFTokenEnum.ProgrammEnd;
         final switch (input[pos++]) with (BFTokenEnum)
         {
         case '>':
             thisToken = IncPtr;
             break;
         case '<':
             thisToken = DecPtr;
             break;
         case '+':
             thisToken = IncVal;
             break;
         case '-':
             thisToken = DecVal;
             break;
         case '.':
             thisToken = OutputVal;
             break;
         case ',':
             thisToken = InputVal;
             break;
         case '[':
             thisToken = LoopBegin;
             break;
         case ']':
             thisToken = LoopEnd;
             break;
         case '\r':
             pos++;
             goto case '\n';
         case '\n':
             //TODO handle lines and proper position information;
             break;
         }

         if (lastToken == thisToken)
         {
             result[resultLen]++;
         }
         else if (thisToken != BFTokenEnum.ProgrammEnd)
         {
             result[++resultLen] = 
Token(cast(BFTokenEnum)thisToken);
         }

     }

     result[++resultLen] = Token(BFTokenEnum.ProgrammEnd);
     return result[0 .. resultLen + 1];
}

In theory it would be possible to get rid of the switch 
altogether ;)
And compress the function ever more.
But readability would suffer and we'd rely on the enum-members to 
correspond to the tokens, which maybe undesirable.