DConf 2013 Day 3 Talk 5: Effective SIMD for modern architectures by Manu Evans
Manu
turkeyman at gmail.com
Thu Jun 20 06:11:30 PDT 2013
On 20 June 2013 21:58, bearophile <bearophileHUGS at lycos.com> wrote:
> Andrei Alexandrescu:
>
> http://youtube.com/watch?v=q_39RnxtkgM
>>
>
> Very nice.
>
> - - - - - - - - - - - - - - - - - - -
>
> Slide 3:
>
> In practise, say we have iterative code like this:
>>
>> int data[100];
>>
>> for (int i = 0; i < data.length; ++i) {
>>     data[i] += 10;
>> }
>>
>
> For code like that in D we have vector ops:
>
> int[100] data;
> data[] += 10;
>
>
> Regarding vector ops: currently they are written with handwritten asm that
> uses SIMD where possible. Once std.simd is in good shape I think the array
> ops can be rewritten (and completed in their missing parts) using a higher
> level style of coding.
>
I was trying to illustrate a process, not so much comment on D array
syntax.
The problem with auto-SIMD applied to array operations is that D doesn't
guarantee that arrays are aligned, nor that their length is a multiple of
'N' elements. That means the compiler loses the opportunity to make the
assumptions that give the biggest performance difference.
For SIMD to pay off, the data must be aligned and a multiple of N elements
long. By using explicit SIMD types, you're forced to adhere to those rules
as a programmer, and the compiler can optimise properly.
You take on the responsibility for handling misalignment and stragglers
yourself, and can perhaps make less conservative choices than the compiler
would.
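To make "misalignment and stragglers" concrete, here is a minimal sketch of
the slide's `+= 10` loop split into a scalar prologue, an aligned vector
body and a scalar tail. The function name `addTen` is made up for
illustration, and the sketch assumes a 16-byte `int4` from core.simd; where
the compiler doesn't provide that type, the scalar loops do all the work.

```d
import core.simd;

/// Hypothetical helper (not from the talk): adds 10 to every element.
void addTen(int[] data)
{
    size_t i = 0;

    // Scalar prologue: advance until the current address is 16-byte aligned.
    while (i < data.length && (cast(size_t)(data.ptr + i) & 15) != 0)
    {
        data[i] += 10;
        ++i;
    }

    // Vector body: only compiled where core.simd defines int4.
    static if (is(int4))
    {
        int4 tens = 10; // broadcast 10 into all four lanes
        for (; i + 4 <= data.length; i += 4)
        {
            auto p = cast(int4*)(data.ptr + i); // aligned by construction
            *p += tens;
        }
    }

    // Scalar tail: the 0..3 stragglers (or everything, if int4 is missing).
    for (; i < data.length; ++i)
        data[i] += 10;
}
```

If the caller can promise alignment and a length that is a multiple of 4,
the prologue and tail disappear, which is the point of using explicit SIMD
types in the first place.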
- - - - - - - - - - - - - - - - - - -
>
> Slide 22:
>
> Comparisons:
>> Full suite of comparisons Can produce bit-masks, or boolean 'any'/'all'
>> logic.
>>
>
> Maybe a little compiler support (for the syntax) would help here.
>
Well, each is a valid comparison in different situations. I'm not sure how
syntax could clearly select the one you want.
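As a rough illustration of why one operator can't pick for you, here is a
scalar model of the three result shapes for a 4-lane `a < b`. It uses plain
arrays rather than real vector types, and every name in it (`Lanes`, `Mask`,
`maskLess`, `anyLess`, `allLess`, `select`) is invented for this sketch, not
taken from std.simd.

```d
// Scalar model of 4-lane comparisons; hypothetical names, not a real API.
alias Lanes = float[4];
alias Mask = int[4];

/// Per-lane bit-mask: all bits set (-1) where a < b, otherwise 0.
Mask maskLess(Lanes a, Lanes b)
{
    Mask m;
    foreach (i; 0 .. 4)
        m[i] = a[i] < b[i] ? -1 : 0;
    return m;
}

/// Boolean 'any': true if at least one lane satisfies a < b.
bool anyLess(Lanes a, Lanes b)
{
    foreach (i; 0 .. 4)
        if (a[i] < b[i]) return true;
    return false;
}

/// Boolean 'all': true only if every lane satisfies a < b.
bool allLess(Lanes a, Lanes b)
{
    foreach (i; 0 .. 4)
        if (!(a[i] < b[i])) return false;
    return true;
}

/// Typical use of a mask: pick x where the mask is set, y elsewhere.
Lanes select(Mask m, Lanes x, Lanes y)
{
    Lanes r;
    foreach (i; 0 .. 4)
        r[i] = m[i] ? x[i] : y[i];
    return r;
}
```

The mask form feeds branch-free selects like `select(maskLess(a, b), a, b)`,
while 'any'/'all' feed an ordinary `if`. Same comparison, different
consumer, so the caller has to say which one is wanted.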
- - - - - - - - - - - - - - - - - - -
>
> Slide 26:
>
> Always pass vectors by value.
>>
>
> Unfortunately it seems a bad idea to give a warning if you pass one of
> those by reference.
>
And I don't think it should. Passing by ref isn't 'wrong', you just
shouldn't do it if you care about performance.
- - - - - - - - - - - - - - - - - - -
>
> Slide 27:
>
> 3. Use ‘leaf’ functions where possible.
>>
>
> I am not sure how much good it is to enforce leaf functions with a @leaf
> annotation.
>
I don't think it would be useful. It should only be considered a general
rule when people are very specifically considering performance above all
else.
It's just a very important detail to be aware of when optimising your code,
particularly so when you're dealing with maths code (often involving simd).
- - - - - - - - - - - - - - - - - - -
>
> Slide 32:
>
> Experiment with prefetching?
>>
>
> Are D intrinsics offering instructions to perform prefetching?
>
Well, GCC does at least. If you're worried about performance at this level,
you're probably already using GCC :)
- - - - - - - - - - - - - - - - - - -
>
> LDC2 supports SIMD on 32-bit Windows too.
>
> So for this code:
>
>
> void main() {
>     alias double2 = __vector(double[2]);
>     auto a = new double[200];
>     auto b = cast(double2[])a;
>     double2 tens = [10.0, 10.0];
>     b[] += tens;
> }
>
>
> LDC2 compiles it to:
>
> movl $200, 4(%esp)
> movl $__D11TypeInfo_Ad6__initZ, (%esp)
> calll __d_newarrayiT
> movl %edx, %esi
> movl %eax, (%esp)
> movl $16, 8(%esp)
> movl $8, 4(%esp)
> calll __d_array_cast_len
> testl %eax, %eax
> je LBB0_3
> movapd LCPI0_0, %xmm0
> .align 16, 0x90
> LBB0_2:
> movapd (%esi), %xmm1
> addpd %xmm0, %xmm1
> movapd %xmm1, (%esi)
> addl $16, %esi
> decl %eax
> jne LBB0_2
> LBB0_3:
> xorl %eax, %eax
> addl $12, %esp
> popl %esi
> ret
>
>
> It uses addpd that works with two doubles at the same time.
>
Sure... did I say this wasn't supported somewhere? Sorry if I gave that
impression.
- - - - - - - - - - - - - - - - - - -
>
> The Reddit thread contains a link to this page, a compiler for a C variant
> from Intel that's optimized for SIMD:
> http://ispc.github.io/
>
> Some of the syntax of ispc:
>
> - - - - - -
>
> The first of these statements is cif, indicating an if statement that is
> expected to be coherent. The usage of cif in code is just the same as if:
>
> cif (x < y) {
> ...
> } else {
> ...
> }
>
> cif provides a hint to the compiler that you expect that most of the
> executing SPMD programs will all have the same result for the if condition.
>
> Along similar lines, cfor, cdo, and cwhile check to see if all program
> instances are running at the start of each loop iteration; if so, they can
> run a specialized code path that has been optimized for the "all on"
> execution mask case.
>
This is interesting. I didn't know about this.