<div class="gmail_quote">On 8 January 2012 02:54, Peter Alexander <span dir="ltr"><<a href="mailto:peter.alexander.au@gmail.com">peter.alexander.au@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="im">I agree with Manu that we should just have a single type like __m128 in MSVC. The other types and their conversions should be solvable in a library with something like strong typedefs.</div></blockquote><div>

<br></div><div>Walter put in a reasonable effort to sway me to his side of the fence last night. I'm still not entirely sold that implementing this inside the language is necessary to achieve these details, but I don't have enough background info to argue, and I'm not the one who has to maintain the code :)</div>

<div><br></div><div>Here are some points we discussed... how do we do these (efficiently) in a library?</div><div><br></div><div>** Literal syntax... and constant folding:</div><div><br></div><div>Constants and literals also need to be aligned. If we use array syntax to express literals, that alignment will be a problem.</div>

<div><br></div><div> int4 v = [ 1,2,3,4 ] + [ 5,6,7,8 ];</div><div><br></div><div>Any constant expression needs to be simplified at compile time, so the line above should end up as: int4 v = [ 6,8,10,12 ];</div><div>Perhaps this is possible with CTFE? Or will it be automatic if you express literals as if they were arrays?</div>
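<div><br></div><div>A rough sketch of the CTFE route for the int4 sum above, assuming a hypothetical library struct (the name Int4, its lanes field and the align(16) attribute are placeholders of mine, not an existing API). Initialising an enum forces compile-time evaluation, so the sum is folded before any code is generated. Whether a library type like this can also be mapped onto real SIMD registers is a separate question:</div>

```d
import std.stdio;

// Hypothetical library-side vector type; the name and layout are
// assumptions for illustration only.
// align(16) requests 16-byte alignment for the type.
align(16) struct Int4
{
    int[4] lanes;

    // Element-wise addition, written so it can also run under CTFE.
    Int4 opBinary(string op : "+")(Int4 rhs) const
    {
        Int4 result;
        foreach (i; 0 .. 4)
            result.lanes[i] = lanes[i] + rhs.lanes[i];
        return result;
    }
}

// An enum initialiser must be a compile-time constant,
// so this addition is evaluated by CTFE.
enum Int4 folded = Int4([1, 2, 3, 4]) + Int4([5, 6, 7, 8]);

// Checked during compilation, not at run time.
static assert(folded.lanes == [6, 8, 10, 12]);

void main()
{
    writeln(folded.lanes);
}
```

<div>The static assert only passes if the addition really was evaluated at compile time, so it doubles as a check that the folding happened.</div>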

<div><br></div><div>** Expression interpretation/simplification:</div><div><br></div><div> float4 v = -b + a;</div><div><br></div><div>Obviously, this should be simplified to 'a - b'.</div><div><br></div><div> float4 v = a*b + c;</div>

<div><br></div><div>This should use a multiply-accumulate opcode on most architectures: FMADDPS v, a, b, c</div><div><br></div><div>** Typed debug info</div><div><br></div><div>In a debugger it's nice to inspect variables in their supposed type.</div>

<div>Can probably use unions to do this... probably wouldn't be as nice though.</div><div><br></div><div>** God knows what other optimisations</div><div><br></div><div>float4 v = [ 0,0,0,0 ]; // XOR v</div><div>etc...</div>
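<div><br></div><div>On the a*b + c case, one way a library could spot the pattern is an expression-template trick: make the multiply return a lazy product node, and give that node a + overload. A minimal sketch, assuming hypothetical Float4 and Product types of my own invention. The loop body stands in for wherever a real multiply-accumulate instruction would be issued, and nothing here emits FMADDPS by itself:</div>

```d
import std.stdio;

// Hypothetical library vector type (name and layout are assumptions).
struct Float4
{
    float[4] lanes;

    // a * b returns a lazy product node instead of computing right away.
    Product opBinary(string op : "*")(Float4 rhs) const
    {
        return Product(this, rhs);
    }
}

// Hypothetical node representing an unevaluated a * b.
struct Product
{
    Float4 lhs, rhs;

    // (a * b) + c: the single place where a fused multiply-add
    // could be issued by a real implementation.
    Float4 opBinary(string op : "+")(Float4 addend) const
    {
        Float4 result;
        foreach (i; 0 .. 4)
            result.lanes[i] = lhs.lanes[i] * rhs.lanes[i] + addend.lanes[i];
        return result;
    }

    // Plain a * b with no addend: evaluate the product on request.
    Float4 value() const
    {
        Float4 result;
        foreach (i; 0 .. 4)
            result.lanes[i] = lhs.lanes[i] * rhs.lanes[i];
        return result;
    }
}

void main()
{
    auto a = Float4([1.0f, 2.0f, 3.0f, 4.0f]);
    auto b = Float4([2.0f, 2.0f, 2.0f, 2.0f]);
    auto c = Float4([10.0f, 10.0f, 10.0f, 10.0f]);

    Float4 v = a * b + c; // resolves to Product.opBinary!"+"
    assert(v.lanes == [12.0f, 14.0f, 16.0f, 18.0f]);
    writeln(v.lanes);
}
```

<div>The obvious cost is that every pattern needs its own node type and overload, which is part of why doing this in the compiler may be the easier path.</div>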

<div><br></div><div><br></div><div>I don't know how much of this is achievable with libraries, but Walter seems to think it will all work much better in the language... I'm inclined to trust his judgement.</div>

</div>