On 7 August 2012 16:56, jerro <span dir="ltr">&lt;<a href="mailto:a@a.com" target="_blank">a@a.com</a>&gt;</span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="im"><br></div><div class="im">

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

That said, almost all simd opcodes are directly accessible in std.simd.<br>

There are relatively few obscure operations that don't have a corresponding<br>

function.<br>

Take the unpck/shuf example above, for instance: both effectively perform a<br>

sort of swizzle, and both are accessible through swizzle!().<br>

</blockquote>

<br></div>

They aren't. Swizzle only takes one argument, so you can't use it to select elements from two vectors. Both unpcklps and shufps take two arguments. Writing a swizzle with two arguments would be much harder.</blockquote>

<div><br></div><div>If there are any usages I've missed or haven't thought of, I'm all ears.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

The swizzle<br>

mask is analysed by the template, and it produces the best opcode to match<br>

the pattern. Take a look at swizzle, it's bloody complicated to do that the<br>

most efficient way on x86.<br>

</blockquote>

<br></div>

Now imagine how complicated it would be to write a swizzle with two vector arguments.</blockquote><div><br></div><div>I can imagine, I'll have a go at it... it's something I considered, but not all architectures can do it efficiently.</div>

<div>That said, a most-efficient implementation would probably still be useful on all architectures, but for cross-platform code, I usually prefer to encourage people to take another approach rather than supply a function that is not particularly portable (or not efficient when ported).</div>
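<div><br></div><div>For reference, a minimal scalar sketch in D of what a two-vector swizzle would have to express. It only models the lane selection and is not std.simd code: <code>shuffle2</code>, <code>unpackLow</code> and <code>lowHalves</code> are hypothetical names, plain <code>float[4]</code> arrays stand in for SIMD registers, and the index convention (0..3 select from the first argument, 4..7 from the second) is an assumption made for illustration.</div>
<pre><code>import std.stdio;

// Hypothetical scalar model of a two-vector swizzle (not the std.simd API).
// Each compile-time index picks one lane from the pair (a, b):
// 0..3 select a[0]..a[3], 4..7 select b[0]..b[3].
float[4] shuffle2(int i0, int i1, int i2, int i3)(float[4] a, float[4] b)
{
    float[8] ab;
    ab[0 .. 4] = a[];
    ab[4 .. 8] = b[];
    float[4] r = [ab[i0], ab[i1], ab[i2], ab[i3]];
    return r;
}

// unpcklps interleaves the low halves: [a0, b0, a1, b1].
alias unpackLow = shuffle2!(0, 4, 1, 5);

// One pattern shufps can produce: two lanes from a, then two lanes from b.
alias lowHalves = shuffle2!(0, 1, 4, 5);

void main()
{
    float[4] a = [0, 1, 2, 3];
    float[4] b = [10, 11, 12, 13];

    float[4] u = unpackLow(a, b);
    float[4] s = lowHalves(a, b);

    assert(u == [0f, 10f, 1f, 11f]);
    assert(s == [0f, 1f, 10f, 11f]);
    writeln(u, " ", s);
}
</code></pre>
<div>Neither pattern can be written as a one-argument swizzle, because each result mixes lanes from both inputs. A real implementation would still have to map every index pattern to the best available opcode on each architecture, which is where the difficulty lies.</div>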

<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

The reason I didn't write the DMD support yet is because it was incomplete,<br>

and many opcodes weren't yet accessible, like shuf for instance... and I<br>

just wasn't finished. Stopped to wait for DMD to be feature complete.<br>

I'm not opposed to this idea, although I do have a concern that, because<br>

there's no __forceinline in D (or macros), adding another layer of<br>

abstraction will make maths code REALLY slow in unoptimised builds.<br>

Can you suggest a method where these would be treated as C macros, and not<br>

produce additional layers of function calls?<br>

</blockquote>

<br></div>

Unfortunately I can't, at least not a clean one. Using string mixins would be one way but I think no one wants that kind of API in Druntime or Phobos.</blockquote><div><br></div><div>Yeah, absolutely not.</div><div>This is possibly the most compelling motivation behind a __forceinline mechanism that I've seen come up... ;)</div>
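<div><br></div><div>To make the string mixin option concrete, here is a minimal sketch in D of a macro-like helper. <code>madd</code> is a hypothetical name chosen for illustration and is not part of Druntime, Phobos or std.simd; the point is only that the helper runs at compile time and its result is pasted into the caller, so no extra function call is left in an unoptimised build.</div>
<pre><code>import std.stdio;

// Hypothetical helper, evaluated at compile time: it returns D source text
// for a multiply-add expression instead of performing the arithmetic itself.
string madd(string a, string b, string c)
{
    return "((" ~ a ~ ") * (" ~ b ~ ") + (" ~ c ~ "))";
}

void main()
{
    float x = 2, y = 3, z = 4;

    // The generated text is pasted in place, much like a C macro expansion,
    // so no call to madd remains at run time.
    float r = mixin(madd("x", "y", "z"));

    assert(r == 10);
    writeln(r);
}
</code></pre>
<div>The cost shows up at the call site: operands are passed as strings and every use has to be wrapped in <code>mixin(...)</code>, which is why this does not make for a clean library API.</div>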

<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

I'm already unhappy that<br>

std.simd produces redundant function calls.<br><br>

&lt;rant&gt; please please please can haz __forceinline! &lt;/rant&gt;<br>

</blockquote>

<br></div>

I agree that we need that.<br>

</blockquote></div><br><div>Huzzah! :)</div>