Multi-architecture binaries

Wed May 2 09:10:45 PDT 2007

> It's a nice idea, but I don't know how it could generate the class to
> put the 'this()' function into (we don't want a memory alloc every time
> we enter that function!)

i'm not sure what you mean. i thought of something like this:

void foo(uint arch)()
{
  auto p = Vec!(arch)([3.5, 1.1, 3.8]);
  auto r = Vec!(arch)([17.0f, 28.25, 1]);
  p *= dot(p,r);
}

the template parameter to Vec could choose the target used by BLADE (x87
or SSE for example). the result is a class with multiple instances of
foo (since all desired instances appear in the static c'tor).
the static c'tor chooses one of the instances (depending on hardware
availability or benchmarks) and copies its address to the init-data in
the classinfo. every time the class is instantiated, the function-pointer
will automatically be initialized with the chosen pointer - no
self-modifying code necessary.
instead of changing the init-data, we can also modify the VTBL in the
classinfo (that's what the first version of this example did).
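
a minimal sketch of the startup selection described above, written for current D and simplified: it keeps the chosen instance in a plain module-level function pointer instead of patching the init-data or the VTBL in the classinfo. the names archX87, archSSE2, dot3 and chosenDot are made up for illustration, dot3 only stands in for a BLADE-generated kernel, and both instances share the same body here. core.cpuid.sse2 is the druntime CPU feature query.

```d
static import core.cpuid;

// hypothetical target ids, for illustration only
enum : uint { archX87, archSSE2 }

// stand-in for a BLADE-generated kernel: one template instance per target.
// a real instance would contain code specialized for `arch`; here every
// instance uses the same scalar loop.
float dot3(uint arch)(float[3] a, float[3] b)
{
    float sum = 0;
    foreach (i; 0 .. 3)
        sum += a[i] * b[i];
    return sum;
}

alias DotFn = float function(float[3], float[3]);

// set once at startup, then only read
__gshared DotFn chosenDot;

shared static this()
{
    // both instances are named here, so both end up in the binary;
    // the hardware check decides which one callers will reach.
    chosenDot = core.cpuid.sse2 ? &dot3!(archSSE2) : &dot3!(archX87);
}
```

callers then go through the pointer, e.g. chosenDot([1f, 2, 3], [4f, 5, 6]). that costs one indirect call per use, which is the overhead the fixup-table idea quoted below would avoid.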

Don Clugston wrote:
> Jascha Wetzel wrote:
>> here is a much simpler version that works with templates. what it boils
>> down to is choosing one template instance at startup that will replace a
>> function pointer.
>>
>> now the only compiler support required would be a pragma or similar to
>> select the target architecture.
> 
> A pragma would only be required as a size optimisation. Probably not
> worth worrying about (We have enough version information already).
> 
>> this could also be used to manage multiple versions of BLADE code.
> 
> It's a nice idea, but I don't know how it could generate the class to
> put the 'this()' function into (we don't want a memory alloc every time
> we enter that function!)
> 
> Interestingly DDL could be fantastic for this. At startup, walk through
> the symbol fixup table, and look for any import symbols marked
> __cpu_fixup_XXX.
> When you find them, look for an export symbol called __cpu_SSE2_XXX, and
> patch them into everything in the fixup list. That way, you even get
> a direct function call, instead of an indirect one.
> 
> I wonder if it's possible to read the return address off the stack, and write back into
> the code that called you, without the operating system triggering a
> security alert -- in that case, the function you call could be a little
> thunk, something like:
> 
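> asm {
>   naked;
>   mov eax, CPU_TYPE;            // index of the detected CPU
>   mov eax, FUNCPOINTERS[eax*4]; // look up the matching implementation
>   mov ecx, [esp];               // get the return address
>   sub eax, ecx;                 // a direct call (rel32) is relative to the return address
>   mov [ecx-4], eax;             // patch the call address, so this thunk never gets called again
>   add eax, ecx;                 // restore the absolute target
>   jmp eax;
> }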
> 
> But I think a modern OS would go nuts if you try this?
> (It's been a long time since I wrote self modifying code).