Vector performance

Fri Jan 13 02:37:57 PST 2012

On 13 January 2012 04:16, Marco Leise <Marco.Leise at gmx.de> wrote:
> On 12 January 2012 16:40, Iain Buclaw <ibuclaw at ubuntu.com> wrote:
>
>> On 12 January 2012 08:29, Manu <turkeyman at gmail.com> wrote:
>>>
>>> On 12 January 2012 02:46, F i L <witte2008 at gmail.com> wrote:
>>>>
>>>>
>>>> Well the idea is you can have both. You could even have a:
>>>>
>>>>  Vector2!(Transition!(Vector4!(Transition!float))) // headache
>>>>  or something more practical...
>>>>
>>>>  Vector4!(Vector4!float) // Matrix4f
>>>>  Vector4!(Transition!(Vector4!float)) // Smooth Matrix4f
>>>>
>>>> Or anything like that. I should point out that my example didn't make it
>>>> clear that a Matrix4!(Transition!float) would be pointless compared to
>>>> Transition!(Matrix4!float) unless each Transition held its own iteration
>>>> value. Example:
>>>>
>>>>  struct Transition(T, bool isTimer = false) {
>>>>
>>>>      T value, start, target;
>>>>      alias value this;
>>>>
>>>>      static if (isTimer) {
>>>>          float time = 0, speed = 0; // D floats default to NaN
>>>>
>>>>          void update() {
>>>>              time += speed;
>>>>              value = start + ((target - start) * time);
>>>>          }
>>>>      }
>>>>  }
>>>>
>>>> That way each channel could update on its own time frame. There may even
>>>> be a way to have each channel be its own separate Transition type, which
>>>> could be interesting. I'm still playing with possibilities.
>>>
>>>
>>>
>>> The vectors aren't quite like that. You can't make a hardware vector out
>>> of anything, only out of things the hardware supports: __vector(float[4]),
>>> for instance.
>>> You can make your own vector template that wraps those, I guess, if you
>>> want to make a matrix that way, but it sounds inefficient. When it comes to
>>> writing the vector/matrix operations, if you're assuming generic code, you
>>> won't be able to make it anywhere near as good as if you write a Matrix4x4
>>> class.
>>>
>>>
>>>>> I think that is also possible if that's what you want to do, and I see no
>>>>> reason why any of these constructs wouldn't be efficient (or supported).
>>>>> You can probably even try it out now with what Walter has already done...
>>>>
>>>>
>>>>
>>>> Cool, I was unaware Walter had begun implementing SIMD operations. I'll
>>>> have to build DMD and test them out. What's the syntax like right now?
>>>
>>>
>>>
>>> The syntax for the types (supporting basic arithmetic) looks like:
>>>   __vector(float[4]) float4vector;
>>> Try it on the latest GDC.
>>>
>>
>> This will change.  I'm uploading core.simd later which has a Vector!()
>> template, and aliases for vfloat4, vdouble2, vint4, etc...
>>
>> I don't plan on implementing vector intrinsics in the same way Walter
>> is doing it.
>>
>> a) GCC already provides its own intrinsics
>> b) The intrinsics I see Walter has already implemented in core.simd are
>> restricted to the x86 line of architectures.
>>
>>
>> Regards
>
>
> Looks like you two should discuss this. I see how Walter envisioned D to
> have a built-in inline assembler, unlike C, where the lack of one resulted in
> several vendor-specific syntaxes, and how GCC has already done the bulk of the
> work to support SIMD on multiple platforms. Naturally you don't want to redo
> that work to wrap Walter's immature approach around the solid base in GDC.
> Can you please have a meeting together with the LDC devs and decide on a
> fair way for everyone to support inline ASM and SIMD intrinsics? Once there
> is common ground for three compilers, other compilers will want to go the
> same route, and everyone is happy with source code that can be compiled by
> every compiler.
> I think this is a fundamental decision for a systems programming language.

Who are the LDC devs? :)

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';