DConf 2013 Day 3 Talk 5: Effective SIMD for modern architectures by Manu Evans
bearophile
bearophileHUGS at lycos.com
Thu Jun 20 07:03:39 PDT 2013
Manu:
> They must be aligned, and multiples of N elements.
The D GC currently allocates them 16-byte aligned (but if you
slice the array you can lose some alignment). On some new CPUs
the penalty for misalignment is small.
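A small sketch of what I mean by losing alignment when slicing. The helper name isAligned16 is made up for this example, and the alignment of GC memory is an implementation detail of the current runtime, so the program prints what it finds instead of assuming it:

```d
import std.stdio;

// Hypothetical helper: true if the pointer sits on a 16-byte boundary.
bool isAligned16(const(void)* p)
{
    return (cast(size_t) p & 15) == 0;
}

void main()
{
    auto data = new float[](64);  // GC-allocated array
    auto tail = data[1 .. $];     // same memory, starting one float later

    writeln("data aligned: ", isAligned16(data.ptr));
    writeln("tail aligned: ", isAligned16(tail.ptr));

    // Whatever the base alignment is, the slice starts exactly
    // float.sizeof (4) bytes further on, so both can't be 16-byte aligned.
    assert(cast(size_t) tail.ptr - cast(size_t) data.ptr == float.sizeof);
    assert(!(isAligned16(data.ptr) && isAligned16(tail.ptr)));
}
```

So code that receives a float[] can't assume 16-byte alignment unless it checks the pointer itself.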
You often have "n" values, where n is variable. If n is large
enough and you are using D vector ops, the handling of the head
and tail doesn't waste too much time. If you have very few values
the head and tail handling dominates, so it's much better to use
explicit SIMD code.
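To make the head and tail handling concrete, here is a rough sketch of the split done by hand. This is not how the D runtime implements vector ops; the function name scale and the choice of 4 lanes (16-byte vectors of float) are assumptions made for the example:

```d
import std.algorithm : min;

enum lanes = 4;  // assumption: 16-byte vectors, so 4 floats per vector

// Hypothetical helper: scales `a` in place, split into head / body / tail.
void scale(float[] a, float k)
{
    // Head: the elements before the first 16-byte boundary.
    immutable misalign = cast(size_t) a.ptr & 15;
    size_t head = misalign ? (16 - misalign) / float.sizeof : 0;
    head = min(head, a.length);

    // Body: as many whole vectors as fit after the head.
    immutable bodyLen = (a.length - head) / lanes * lanes;

    foreach (ref x; a[0 .. head]) x *= k;            // scalar head
    a[head .. head + bodyLen][] *= k;                // aligned, multiple of 4
    foreach (ref x; a[head + bodyLen .. $]) x *= k;  // scalar tail
}

void main()
{
    auto data = new float[](11);
    data[] = 1.0f;

    scale(data[1 .. $], 2.0f);  // deliberately offset slice

    assert(data[0] == 1.0f);
    foreach (x; data[1 .. $]) assert(x == 2.0f);
}
```

With 10 elements the head and tail loops can touch up to 6 of them one at a time, which is why the split only pays off when n is large.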
> Well, each are valid comparisons in different situations. I'm
> not sure how syntax could clearly select the one you want.
Maybe later we'll look for some syntax sugar for this.
>> Are D intrinsics offering instructions to perform prefetching?
>
> Well, GCC does at least. If you're worried about performance at
> this level, you're probably already using GCC :)
I think D SIMD programmers will expect something functionally
like __builtin_prefetch to be available in D too:
http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-g_t_005f_005fbuiltin_005fprefetch-3396
Thank you,
bye,
bearophile