Labels as values and threaded-code interpretation

Sun Jun 2 11:25:30 PDT 2013

On 02-06-2013 19:58, Dmitry Olshansky wrote:
> 02-Jun-2013 21:49, Alex Rønne Petersen пишет:
>> On 02-06-2013 19:43, Dmitry Olshansky wrote:
>>> 02-Jun-2013 20:48, Alex Rønne Petersen пишет:
>>>> On 02-06-2013 10:52, Dmitry Olshansky wrote:
>>>>> 01-Jun-2013 20:13, Timon Gehr пишет:
>>>>>> On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm sure this has been brought up before, but I feel I need to
>>>>>>> bring it
>>>>>>> up again (because I'm going to be writing a threaded-code
>>>>>>> interpreter):
>>>>>>> http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
>>>>>>>
>>>>>>> This is an incredibly important extension. The final switch
>>>>>>> statement is
>>>>>>> not a replacement because it doesn't allow the programmer to store a
>>>>>>> label address directly into a code stream, which is what's
>>>>>>> essential to
>>>>>>> write a threaded-code interpreter.
>>>>>>>
>>>>>>
>>>>>> I'd also like to see this.
>>>>>>
>>>>>
>>>>> Same here.
>>>>>
>>>>> Though I believe a way to force tail-call can support the same use
>>>>> case
>>>>> also helping functional programming.
>>>>>
>>>>> Say:
>>>>>
>>>>> goto Call-Expression; //forced tail call
>>>>>
>>>>> instead of:
>>>>>
>>>>> return Call-Expression;
>>>>>
>>>>
>>>> I'm not sure that can support threaded-code interpretation.
>>>
>>> Why not:
>>>
>>> alias OpCode = function void();
>>>
>>> OpCode[] opcodes = [ opcode_1, ... ];
>>>
>>> int pc;
>>> ...
>>>
>>> void opcode_1()
>>> {
>>>      ... //pick operands do whatever
>>>      pc = pc + num_operands; //skip over arguments
>>>      OpCode next = cast(OpCode)bytecode[pc];
>>>      goto next(); //this is baked-in threaded dispatch
>>> }
>>>
>>> void opcode_2(){ ... }
>>>
>>> //say bytecode contains operands and addresses of fucntions
>>> void execute(size_t[] bytecode, int pc)
>>> {
>>>      OpCode start  = cast(OpCode)bytecode[pc];
>>>      pc++;
>>>      goto start();
>>> }
>>>
>>> One can get away without casting if data is in a separate array.
>>> Then this solution is perfectly safe in this limited form. Call would
>>> only point to a function hence no problem with jumping who knows where
>>> and less problems for optimizer.
>>>
>>
>> The problem here is assuming the interpreter state can be global. Once
>> you make it non-global (and thus have to pass it in the goto start(...)
>> call) you get all the overhead of a regular function call.
>>
>
> One pointer to a struct MyInterpreterState. Think of it as 'this'
> pointer :)

That's one load and store for every single step in the interpreter. 
Sounds harmless, but every cycle matters when every goto start(...) 
amounts to a single instruction in the program you're running.

> Anyhow cramming the whole interpreter in one huge function too has
> downsides (and hopelessly inefficent register allocation is one).

I'd frankly be surprised if that was the case. But without any 
investigation, it's just word against word.

> And you still access most locals via stack pointer. And then there got
> to be something beyond locals - 'context', that is still passed from one
> bytecode execution to another.

It depends on what you mean by context. Can you elaborate?

>
> BTW Globals are actually quite expansive to access anyway, this was a
> mock example.
>

Sure.

Either way, evidence suggests that this style of interpreter has 
performance advantages in practice.

-- 
Alex Rønne Petersen
alex at alexrp.com / alex at lycus.org
http://alexrp.com / http://lycus.org