Computed gotos on Reddit
Dmitry Olshansky
dmitry.olsh at gmail.com
Tue Jul 24 15:33:19 PDT 2012
On 25-Jul-12 01:40, Walter Bright wrote:
> On 7/24/2012 12:50 PM, Dmitry Olshansky wrote:
>>> are the same (same as in same number of indirections).
>>>
>>> switch (code[pc++])
>>>
>>> and
>>>
>>> goto code[pc++]()
>>>
>>> are the same, too.
>>
>> It's not. Let's get to assembly then. It's an illustration as I'm no
>> expert and
>> may have made some illegal shortcuts in this listing.
>>
>> goto code[pc++]() is roughly:
>>
>> mov ecx, [ebx]
>> inc ebx
>> jmp [ecx]
>
> jmp code[ecx]
>
There is no code just jump [ecx]. The offset is included already.
>
>> switch(code[pc++]) is:
>>
>> mov ecx, [ebx]
>> inc ebx
>> mov ecx, [edx+ecx] // assuming jump table is at edx
>> jump [ecx]
>
> jmp jumptable[ecx]
Damn, it's not the same. If in the above ecx is pc++, here it it
code[pc++] i.e. needs to be loaded. The whole point. And yes, I *didn't*
object that it's still 1 JUMP. I'm more focused on extra work being done.
> Please post source code example so I understand what you mean.
OK. It sure gets confusing.
Here is an article that shows the big picture of to what I want to do:
http://www.complang.tuwien.ac.at/anton/lvas/sem06w/revucky.pdf
It contains some advanced techniques that are out of scope of current
topic but Introduction & Background are short and full of relevant facts.
And for the sample code here it is, casts are ommited for brevity.
First one is "if I had a way to have enforced tail call to function or
take address of label".
Let's assume labels:
//GLOBALS
size_t[] bytecode = firstPassCompile("my_source.ds"); //say it's
DMDscript ;)
bool jitted = false;
void run(size_t pc)
{
//so here bytecode is normal bytecode if jitted != true
//for simplicity sake it starts with some number e.g.:
// 0 - op_1
// 1 - op_2, etc.
if(!jitted)
{
int i=0;
//1:1 map of each label that servers specific opcode
static tabulated_ops[] =
[ &L_op1, &L_op2, &L_op3, ... ];
while(notEndOfProgram(bytecode[i]) )
{
size_t n = bytecode[i];
//store real target
bytecode[i] = tabulated_ops[n];
//advance by some size, skipping operands etc.
i += opSize(n);
}
}
//interpret:
goto bytecode[pc];
//<---- never gets here
L_op1:
//do my op1 thing
pc += x;
//DISPATH:
goto bytecode[pc]; // load address at pc & jump to it
L_op2:
//do some other thing
pc += y;
//DISPATH:
goto bytecode[pc]; // load address at pc & jump to it
L_op3:
//my other op, jumps back
pc -= ...;
//DISPATH:
goto bytecode[pc]; // load address at pc & jump to it
...
L_opN:
...
goto bytecode[pc]; // load address at pc & jump to it
}
Now the same thing with switches.
And before you ask -
NO! Single switch won't do as it would have successful branch prediction
rate of as bad as about 2% (see the article!). The more opcode types the
worse prediction is.
Anyway here it goes:
//GLOBALS
size_t[] bytecode = firstPassCompile("my_source.ds"); //say it's
DMDscript ;)
void run(size_t pc)
{
//here we don't JIT to real addresses beforehand
//as jump tables somehow should be good enough
// Okay...
//interpret:
switch(bytecode[pc])
{
L_op1:
case 0:
//do my op1 thing
pc += x;
//DISPATCH:
//here comes trouble - 1st of N nearly identical tables??
switch(bytecode[pc])
{
case 0: goto L_op1;
case 1: goto L_op2;
...
}
L_op2:
case 1:
//do some other thing
pc += y;
//DISPATCH:
//here comes trouble - 2nd table?
switch(bytecode[pc])
{
case 0: goto L_op1;
case 1: goto L_op2;
...
}
L_op3:
case 2:
//my other op, jumps back
pc -= ...;
//DISPATCH:
//here comes trouble - 3rd table?
switch(bytecode[pc])
{
case 0: goto L_op1;
case 1: goto L_op2;
...
}
...
L_opN:
case N-1:
...
//DISPATCH:
//here comes trouble Nth table ... time to count overhead
switch(bytecode[pc])
{
case 0: goto L_op1;
case 1: goto L_op2;
...
}
}//end of giant switch
}
--
Dmitry Olshansky
More information about the Digitalmars-d
mailing list