Any usable SIMD implementation?

Johannes Pfau via Digitalmars-d digitalmars-d at puremagic.com
Thu Apr 7 07:46:06 PDT 2016


On Thu, 07 Apr 2016 13:27:05 +0000,
Johan Engelen <j at j.nl> wrote:

> On Thursday, 7 April 2016 at 11:25:47 UTC, Johannes Pfau wrote:
> > On Thu, 07 Apr 2016 10:52:42 +0000,
> > Kai Nacke <kai at redstar.de> wrote:
> >  
> >> glibc has a special mechanism for resolving the called 
> >> function during loading. See the section on the GNU Indirect 
> >> Function Mechanism here: 
> >> https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/W51a7ffcf4dfd_4b40_9d82_446ebc23c550/page/Optimized%20Libraries
> >> 
> >> Would be awesome to have something similar in druntime/Phobos.
> >> 
> >> Regards,
> >> Kai  
> >
> > Available in GCC as the 'ifunc' attribute: 
> > https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes
> >
> > What do you mean by 'something similar in druntime/phobos'? A 
> > platform independent (slightly slower) variant?:
> >
> > http://dpaste.dzfl.pl/0aa81325a26a  
> 
> I thought that the ifunc mechanism means an indirect call (i.e. a 
> function ptr is set at the start of the program) ? That would be 
> the same as what you are doing without performance difference.
> 
> https://gcc.gnu.org/wiki/FunctionMultiVersioning
> "To keep the cost of dispatching low, the IFUNC mechanism is used 
> for dispatching. This makes the call to the dispatcher a one-time 
> thing during startup and a call to a function version is a single 
> jump ** indirect ** instruction." (emphasis mine)

The simple variant I posted needs an additional branch on every
function call. If we instead initialize the function pointer in a
shared static ctor, there is indeed no performance difference. The main
problem is that cyclic constructor detection will make it more
difficult to implement a generic template solution.
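To make the difference concrete, here is a minimal sketch of both variants. It assumes the simple variant selects the implementation lazily on first use (the dpaste contents are not reproduced here). The names sumGeneric, sumSse42, sumLazy and sum are hypothetical, and the two implementations are deliberately identical placeholders.

```d
import core.cpuid : sse42;

// Hypothetical implementations. Real ones would differ in the
// instructions they use; here both compute the same plain sum.
int sumGeneric(const(int)[] a) { int s; foreach (x; a) s += x; return s; }
int sumSse42(const(int)[] a)   { int s; foreach (x; a) s += x; return s; }

alias SumFn = int function(const(int)[]);

// Variant 1: lazy selection inside a wrapper.
// Every call pays for the null check (the additional branch).
// Concurrent first calls may both store the pointer, but they
// store the same value.
__gshared SumFn lazyImpl;

int sumLazy(const(int)[] a)
{
    if (lazyImpl is null)
        lazyImpl = sse42 ? &sumSse42 : &sumGeneric;
    return lazyImpl(a);
}

// Variant 2: selection happens once, before main runs.
// A call is then a single indirect call with no extra branch.
__gshared SumFn sum;

shared static this()
{
    sum = sse42 ? &sumSse42 : &sumGeneric;
}

unittest
{
    assert(sumLazy([1, 2, 3]) == 6);
    assert(sum([1, 2, 3]) == 6);
}
```

The block can be checked with something like `dmd -unittest -main -run`. Variant 2 is the one that would run into cyclic constructor detection if two modules using this pattern imported each other.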

http://www.airs.com/blog/archives/403
"An alternative to all this linker stuff would be a variable holding a
function pointer. The function could then be written in assembler to do
the indirect jump. The variable would be initialized at program startup
time. The efficiency would be the same. The address of the function
would be the address of the indirect jump, so function pointers would
compare consistently."

> I looked into this some time ago and did not see a reason to use 
> the ifunc mechanism (which would not be available on Windows). I 
> thought it should be implementable in a library, exactly as you 
> did in your dpaste! :-)

> (does `&foo` return `impl`?)

No, &foo will return the address of the wrapper function. I'm not sure
whether we can solve this. IIRC we can't overload &. Here's the
alternative using a constructor, which makes the address accessible.
The syntax is still different, though:

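void foo1() {} // stands in for the selected implementation

__gshared void function() foo;
shared static this()
{
    foo = &foo1; // chosen once, at program startup
}

// inside some function:
auto varAddr = &foo;             // address of the variable
auto funcAddr = cast(void*) foo; // the function address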
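For comparison, a small self-contained sketch of what the address-of operator yields in each style. The names fooWrapper and calls are hypothetical and only there to make the checks observable.

```d
__gshared int calls;

// Hypothetical selected implementation.
void foo1() { ++calls; }

// Wrapper style: callers see an ordinary function.
void fooWrapper() { foo1(); }

// Pointer style: callers see a variable holding the implementation.
__gshared void function() foo;

shared static this()
{
    foo = &foo1;
}

unittest
{
    // Taking the wrapper's address does not give the implementation.
    assert(&fooWrapper !is &foo1);

    // The value of the pointer variable is the implementation's address.
    assert(foo is &foo1);

    // &foo is the address of the variable itself, not of a function.
    static assert(is(typeof(&foo) == void function()*));

    // Calling through the variable reaches the implementation.
    immutable before = calls;
    foo();
    assert(calls == before + 1);
}
```

With the pointer style, function pointers compare consistently as long as callers pass around the value of foo rather than &foo.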

