Multi-architecture binaries
janderson
askme at me.com
Wed May 2 08:53:18 PDT 2007
Don Clugston wrote:
> Jascha Wetzel wrote:
>> here is a much simpler version that works with templates. what it boils
>> down to is choosing one template instance at startup that will replace a
>> function pointer.
>>
>> now the only compiler support required would be a pragma or similar to
>> select the target architecture.
>
> A pragma would only be required as a size optimisation. Probably not
> worth worrying about (We have enough version information already).
>
>> this could also be used to manage multiple versions of BLADE code.
>
> It's a nice idea, but I don't know how it could generate the class to
> put the 'this()' function into (we don't want a memory alloc every time
> we enter that function!)
>
> Interestingly DDL could be fantastic for this. At startup, walk through
> the symbol fixup table, and look for any import symbols marked
> __cpu_fixup_XXX.
> When you find them, look for an export symbol called __cpu_SSE2_XXX, and
> patch them into everything in the fixup list. That way, you even get
> a direct function call, instead of an indirect one.
>
> I wonder if it's possible to pop ESP off the stack, and write back into
> the code that called you, without the operating system triggering a
> security alert -- in that case, the function you call could be a little
> thunk, something like:
>
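> asm {
>     naked;
>     mov EAX, CPU_TYPE;
>     mov EAX, FUNCPOINTERS[EAX]; // implementation for this CPU
>     mov ECX, [ESP];             // return address (top of stack on entry)
>     mov EDX, EAX;
>     sub EDX, ECX;               // a near call stores target minus return address
>     mov [ECX-4], EDX;           // patch the call, so this thunk never gets called again
>     jmp EAX;
> }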
>
> But I think a modern OS would go nuts if you try this?
> (It's been a long time since I wrote self modifying code).
That may be the case. Also, if the code is only called once, it would
cause a huge cache miss lasting many nanoseconds.
If this happens a lot, the code would keep spiking all over the place
(for the first few seconds of the app, and then again whenever you hit
code that hasn't been used before).
A better approach would be to resolve them in large batches, perhaps
at the per-module level. That way you get fewer cache misses.
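Something like the following is what I have in mind: a rough sketch, not
tested against any real code. It combines Jascha's template-instance idea
with binding everything for a module in one pass at startup. The names
(Arch, sum, scale, sumImpl, scaleImpl, bindAll) are all made up for
illustration, the kernel bodies are placeholders rather than real SSE2
code, and I am assuming core.cpuid is available for the CPU check.

```d
static import core.cpuid;

enum Arch { generic, sse2 }

// Hypothetical kernels: one template instance per architecture.
// The bodies are placeholders; a real sse2 instance would use vector code.
float sum(Arch arch)(const(float)[] xs)
{
    float total = 0;
    foreach (x; xs)
        total += x;
    return total;
}

float scale(Arch arch)(float x, float k)
{
    return x * k;
}

// Function pointers the rest of the module calls through.
// __gshared, because the shared module constructor runs once per process.
__gshared float function(const(float)[]) sumImpl;
__gshared float function(float, float) scaleImpl;
__gshared Arch chosenArch;

// Bind every pointer in this module to one architecture in a single pass.
void bindAll(Arch arch)()
{
    chosenArch = arch;
    sumImpl = &sum!arch;
    scaleImpl = &scale!arch;
}

// Runs once at startup, before main.
shared static this()
{
    if (core.cpuid.sse2)
        bindAll!(Arch.sse2)();
    else
        bindAll!(Arch.generic)();
}
```

Callers just write sumImpl(data). You keep the indirect call, but nothing
gets patched at run time and all the pointers for the module are filled in
together.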
Nice idea though.
-Joel