Any usable SIMD implementation?

9il via Digitalmars-d digitalmars-d at puremagic.com
Thu Apr 7 03:03:50 PDT 2016


On Thursday, 7 April 2016 at 09:41:06 UTC, Walter Bright wrote:
> On 4/7/2016 12:59 AM, 9il wrote:
>> 1. Executable size will grow with every instruction set release
>
> Yes, and nobody cares. With virtual memory and demand loading, 
> unexecuted code will never be loaded off of disk and will never 
> consume memory space. And with a 64 bit address space, there 
> will never be a shortage of virtual address space.
>
> It will consume space on your 1 terabyte drive. Meh. I have 
> several of those drives, and what consumes space is video, not 
> code binaries :-)
>

What about a 1 GB 2D game for a phone, or maybe for a clock?

>
>> 3. This would not solve the problem for generic BLAS implementation
>> for Phobos at all! How you would force compiler to USE and NOT USE
>> specific vector permutations for example in the same object file?
>> Yes, I know, DMD has not permutations. No, I don't want to write
>> permutation for each architecture. Why? I can write simple D code
>> that generates single LLVM IR code which would work for ALL targets!
>
> There's no reason for the compiler to make target CPU 
> information available when writing generic code.

This is not true for a BLAS written in D. You don't want to see the 
opportunities. The end result of your dogmatic decision will be 
slower code for DMD, while LDC and GDC will implement the simple 
features that are required. I just wanted to write fast code for 
DMD too.
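
To illustrate what compile-time target information buys generic code, 
here is a minimal sketch in C rather than D (an assumption made for 
illustration; it is not code from this thread). The names `dot` and 
`BLOCK` are hypothetical. The block width is taken from the target 
macros that GCC and Clang predefine (`__AVX__`, `__SSE2__`, 
`__ARM_NEON`), so a single source file adapts to each target without 
a hand-written kernel per architecture:

```c
#include <stddef.h>

/* Hypothetical example: pick the kernel block width from the
   compiler's predefined target macros. Only BLOCK differs between
   targets; the kernel below is written once. */
#if defined(__AVX__)
enum { BLOCK = 8 };   /* 256-bit registers hold 8 floats */
#elif defined(__SSE2__) || defined(__ARM_NEON)
enum { BLOCK = 4 };   /* 128-bit registers hold 4 floats */
#else
enum { BLOCK = 1 };   /* unknown target: plain scalar loop */
#endif

/* Dot product with BLOCK independent accumulators, a shape that an
   optimizing compiler can map onto vector registers of that width. */
float dot(const float *a, const float *b, size_t n)
{
    float acc[BLOCK] = {0};
    size_t i = 0;

    /* Main loop: process BLOCK elements per iteration. */
    for (; i + BLOCK <= n; i += BLOCK)
        for (size_t j = 0; j < BLOCK; ++j)
            acc[j] += a[i + j] * b[i + j];

    /* Reduce the partial sums. */
    float sum = 0.0f;
    for (size_t j = 0; j < BLOCK; ++j)
        sum += acc[j];

    /* Tail: leftover elements when n is not a multiple of BLOCK. */
    for (; i < n; ++i)
        sum += a[i] * b[i];

    return sum;
}
```

Compiling the same file with different target flags (for example 
`-mavx` with GCC or Clang) changes only `BLOCK`; the kernel source 
stays the same.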

