SIMD/intrinsics questions

Tue Nov 10 12:40:31 PST 2009

Walter Bright Wrote:

> Mike Farnsworth wrote:
> > For my purposes, runtime detection is probably out the window, unless
> > the tests for it can happen infrequently enough to reduce the
> > overhead.  There are too many SSE variations to switch on them all,
> > and they incrementally provide better and better functionality that I
> > could make use of.  I'd rather compile different executables for
> > different hardware and distribute them all (e.g. detect the SSE
> > version at compile time).  Really, high performance graphics is an
> > exercise in getting tightly vectorized code to inline appropriately,
> > eliminate as many loads and stores as possible, and then on top of
> > that build algorithms that don't suck in runtime or memory/cache
> > complexity.
> 
> The way to do it is to not distribute multiple executables, but have the 
> initialization code detect the chip. Then, you compile the same code for 
> different instructions, and have a high level runtime switch between them.
> 
> I used to do this for machines with and without x87 support.

Was it actually rewriting the executable code to call the alternate functions (e.g. an exe load-time decision: patch the code in memory, then run)?  I thought that sort of thing would run into all sorts of runtime linker issues (read-only code pages in memory, shared libs that also need the rewriting, etc.), but then again, JIT compilers do that all the time.
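For comparison, the variant that avoids patching altogether would be a function pointer chosen once at startup.  Below is a rough sketch in C, not a description of what dmd or its runtime actually does.  All the names (sum_fn, sum_scalar, sum_sse2, cpu_has_sse2, simd_init, sum) are made up for illustration.  The "SSE2" kernel is a plain C stand-in for code that would really be compiled with different instruction-set flags.  The detection relies on the GCC/Clang builtin __builtin_cpu_supports and falls back to the scalar path on other compilers or architectures.

```c
#include <stddef.h>

/* Hypothetical kernel signature: sum n floats. */
typedef float (*sum_fn)(const float *v, size_t n);

/* Baseline build of the kernel, safe on any CPU. */
static float sum_scalar(const float *v, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Stand-in for the same kernel built for SSE2. In a real project this
   would sit in its own translation unit compiled with e.g. -msse2;
   here it is ordinary C with four partial sums. */
static float sum_sse2(const float *v, size_t n) {
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += v[i];
        a1 += v[i + 1];
        a2 += v[i + 2];
        a3 += v[i + 3];
    }
    for (; i < n; i++)          /* leftover elements */
        a0 += v[i];
    return (a0 + a1) + (a2 + a3);
}

/* Assumption: GCC or Clang on x86. Anywhere else we report "no SSE2"
   so the scalar path is used. */
static int cpu_has_sse2(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") != 0;
#else
    return 0;
#endif
}

/* The high-level switch: set once, then every call goes through it. */
static sum_fn sum_impl = NULL;

void simd_init(void) {
    sum_impl = cpu_has_sse2() ? sum_sse2 : sum_scalar;
}

float sum(const float *v, size_t n) {
    if (sum_impl == NULL)       /* lazy init if simd_init() was skipped */
        simd_init();
    return sum_impl(v, n);
}
```

With this layout the detection cost is paid once in simd_init(), and each later call to sum() costs one indirect call.  No code pages are modified, so the read-only page and shared library concerns do not come up, but the indirect call does block inlining across the switch, which is why the switch would have to sit at a fairly high level.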

Does dmd already have some of this capability hanging around (but not used yet)?

-Mike