<div class="gmail_quote">On 10 January 2012 08:09, Martin Nowak <span dir="ltr"><<a href="mailto:dawg@dawgfoto.de">dawg@dawgfoto.de</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

On 07.01.2012 at 21:44, Piotr Szturmaj <<a href="mailto:bncrbme@jadamspam.pl" target="_blank">bncrbme@jadamspam.pl</a>> wrote:<div><div class="h5"><br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

The idea is to make versions of code that are environment-dependent and never change during runtime, _without_ resorting to if statements. Such a runtime version statement would be valid only inside function bodies.<br>

<br>

Examples of such versions may be:<br>

* supported SIMD CPU extensions MMX, SSE, SSE2, etc.<br>

* AMD vs Intel CPU, to use instructions that are not available on both<br>

* different OS versions (XP/Vista/7, Linux kernel versions)<br>

<br>

Why this instead of the current if statement?<br>

* some additional speed, avoids multiple checks in frequent operations<br>

* making specific executables (e.g. SSE4 only) by limiting the set of supported runtime options at compile time<br>

<br>

Code example:<br>

<br>

void main()<br>

{<br>

     version(rt_SSE4)<br>

     {<br>

         ...<br>

     }<br>

     else version(rt_SSE2)<br>

     {<br>

         ...<br>

     }<br>

     else<br>

     {<br>

         // portable code<br>

     }<br>

}<br>

<br>

In this example, the program checks the supported extensions only once, before calling main(). It then modifies the function code so that only the matching versions are executed.<br>

<br>

Runtime version identifiers may be set inside shared static constructors of modules (this implies that an rt-version may not be used inside them). SIMD extensions would preferably be set by druntime with the help of core.cpuid.<br>


<br>

The code modification mechanism is up to the implementation. One that comes to mind is having the compiler insert unconditional jumps and fixing them up before calling main().<br>

<br>

An additional advantage is the possibility of generating executables for particular environments. This may help reduce executable size when targeting a specific CPU, especially on a constrained/embedded system. Many cpuid checks inside druntime could also be avoided.<br>


<br>

Just thinking out loud :)<br>

</blockquote>

<br></div></div>

Because it could only patch non-inlined code, you might<br>

as well use lazy binding with thunks.<br>

<br>

// use __gshared so that all threads share one pointer<br>

__gshared R function(Args) doSomeThing = &setThunk;<br>

<br>

R setThunk(Args args)<br>

{<br>

   if (sse4)<br>

   {<br>

      doSomeThing = &sse4Impl;<br>

   }<br>

   else if (sse2)<br>

   {<br>

      doSomeThing = &sse2Impl;<br>

   }<br>

   else<br>

   {<br>

      doSomeThing = &nativeImpl;<br>

   }<br>

<br>

   return doSomeThing(args);<br>

}<br>

<br>

Much simpler, thread safe and more efficient.<br>

<br>

__gshared SSE2 = tuple("foo", &sse2Foo, "bar", &sse2Bar);<br>

__gshared SSE4 = tuple("foo", &sse4Foo, "bar", &sse4Bar);<br>

__gshared typeof(SSE2)* _impl;<br>

<br>

shared static this()<br>

{<br>

   if (sse4)<br>

   {<br>

      _impl = &SSE4;<br>

   }<br>

   else if (sse2)<br>

   {<br>

      _impl = &SSE2;<br>

   }<br>

   else<br>

   {<br>

      _impl = &Native;<br>

   }<br>

}<br>

<br>

_impl.foo(args);<br>

</blockquote></div><br><div>Function pointers are super-slow on some architectures. I don't think it's a particularly good solution unless the functions you're calling do a lot of work.</div>
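<div>For reference, here is a minimal runnable sketch of the lazy-binding thunk quoted above. The names sum, bindSum, sumSse4, sumSse2 and sumPortable are hypothetical, and all three implementations are plain loops standing in for real SIMD code. It assumes core.cpuid's sse41 and sse2 properties for feature detection, and that a pointer-sized store is atomic on the target, so threads racing through the thunk all end up storing the same value.</div>

```d
import core.cpuid : sse2, sse41;

// Hypothetical implementations: real code would use SIMD in the first two.
int sumSse4(const(int)[] a)     { int s; foreach (x; a) s += x; return s; }
int sumSse2(const(int)[] a)     { int s; foreach (x; a) s += x; return s; }
int sumPortable(const(int)[] a) { int s; foreach (x; a) s += x; return s; }

// One process-wide pointer that initially points at the binding thunk.
__gshared int function(const(int)[]) sum = &bindSum;

// The thunk: picks an implementation once, rebinds the pointer, then
// forwards the first call to the chosen implementation.
int bindSum(const(int)[] a)
{
    if (sse41)
        sum = &sumSse4;
    else if (sse2)
        sum = &sumSse2;
    else
        sum = &sumPortable;
    return sum(a);
}

void main()
{
    assert(sum([1, 2, 3]) == 6);  // first call goes through the thunk
    assert(sum !is &bindSum);     // the pointer has been rebound
    assert(sum([4, 5]) == 9);     // later calls skip the feature checks
}
```

<div>Note that every call after binding is still an indirect call through sum, which is exactly the cost in question here: the approach only pays off when each implementation does enough work to amortise it.</div>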