Labels as values and threaded-code interpretation

Alex Rønne Petersen alex at lycus.org
Sat Jun 1 19:30:37 PDT 2013


On 01-06-2013 07:41, Manu wrote:
> GCC has a non-standard extension to do this, I've used it to get great
> performance wins writing emulators. I love this feature, love to see it
> in D!

Yes, it's basically essential for high-perf interpreters. It's a feature 
that a systems language like D must have.

>
> On 1 Jun 2013 15:30, "Alex Rønne Petersen" <alex at lycus.org
> <mailto:alex at lycus.org>> wrote:
>
>     Hi,
>
>     I'm sure this has been brought up before, but I feel I need to bring
>     it up again (because I'm going to be writing a threaded-code
>     interpreter):
>     http://gcc.gnu.org/onlinedocs/__gcc/Labels-as-Values.html
>     <http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html>
>
>     This is an incredibly important extension. The final switch
>     statement is not a replacement because it doesn't allow the
>     programmer to store a label address directly into a code stream,
>     which is what's essential to write a threaded-code interpreter.
>
>     The Erlang folks went through hell just to use this feature; see the
>     5th Q at:
>     http://www.erlang.org/doc/__installation_guide/INSTALL-__WIN32.html#Frequently-Asked-__Questions
>     <http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions>
>
>     The idea is to be able to write code like this:
>
>     ----
>
>     import std.algorithm;
>
>     enum Op : ubyte
>     {
>          imm,
>          add,
>          sub,
>          // ...
>          ret,
>     }
>
>     final class Insn
>     {
>          Op op;
>          size_t[] args;
>          void* lbl;
>          Insn next;
>     }
>
>     final class State
>     {
>          Insn pc;
>          size_t[64] regs;
>     }
>
>     size_t interp(Insn[] code)
>     {
>          // Set up the instruction stream with label addresses
>          // the first time that it is executed. Label addresses
>          // are stable, so we only do this once.
>
>          foreach (insn; code.filter!(x => !x.lbl)())
>          {
>              void* lbl;
>
>              with (Op)
>              {
>                  final switch (insn.op)
>                  {
>                      case imm: lbl = &&handle_imm; break;
>                      case add: lbl = &&handle_add; break;
>                      case sub: lbl = &&handle_sub; break;
>                      // ...
>                      case ret: lbl = &&handle_ret; break;
>                  }
>              }
>
>              insn.lbl = lbl;
>          }
>
>          // Start interpreting the entry instruction.
>
>          auto state = new State;
>          state.pc = code[0];
>          goto *state.pc.lbl;
>
>          // Interpreter logic follows...
>
>          // The whole point is to avoid unnecessary function
>          // calls, so we use a mixin template for this stuff.
>          enum advance = "state.pc = state.pc.next;" ~
>                         "goto *state.pc.lbl;";
>
>          handle_imm:
>          {
>              state.regs[state.pc.args[0]] = state.pc.args[1];
>              mixin(advance);
>          }
>
>          handle_add:
>          {
>              state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]]
>     + state.regs[state.pc.args[2]];
>              mixin(advance);
>          }
>
>          handle_sub:
>          {
>              state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]]
>     - state.regs[state.pc.args[2]];
>              mixin(advance);
>          }
>
>          // ...
>
>          handle_ret:
>          {
>              return state.regs[state.pc.args[0]];
>          }
>
>          assert(false);
>     }
>
>     ----
>
>     Notice that there are no function calls going on. Just plain jumps
>     all the way through. This is a technique that many real world
>     interpreters use to get significant speedups, and I for one think we
>     desperately need it.
>
>     Implementing it should be trivial as both LLVM and GCC support
>     taking the address of a block. I'm sure the DMD back end could be
>     extended to support it too.
>
>     --
>     Alex Rønne Petersen
>     alex at alexrp.com <mailto:alex at alexrp.com> / alex at lycus.org
>     <mailto:alex at lycus.org>
>     http://alexrp.com / http://lycus.org
>


-- 
Alex Rønne Petersen
alex at alexrp.com / alex at lycus.org
http://alexrp.com / http://lycus.org


More information about the Digitalmars-d mailing list