Scientific computing and parallel computing C++23/C++26
Ola Fosheim Grøstad
ola.fosheim.grostad at gmail.com
Fri Jan 14 15:17:59 UTC 2022
On Friday, 14 January 2022 at 01:39:32 UTC, Nicholas Wilson wrote:
> On Thursday, 13 January 2022 at 22:27:27 UTC, Ola Fosheim
> Grøstad wrote:
>> Are there some performance benchmarks on modest hardware?
>> (e.g. a standard macbook, imac or mac mini) Benchmarks that
>> compares dcompute to CPU with auto-vectorization (SIMD)?
>
> Part of the difficulty with that is that it is an apples-to-
> oranges comparison. Also I no longer have hardware that can run
> dcompute, as my old windows box (with intel x86 and OpenCL 2.1
> with an nvidia GPU) died some time ago.
>
> Unfortunately Macs and dcompute don't work very well. CUDA
> requires nvidia, and OpenCL needs the ability to run SPIR-V
> (clCreateProgramWithIL call) which requires OpenCL 2.x which
> Apple does not support. Hence supporting Metal was of some
> interest. You might in theory be able to use PoCL or intel
> based OpenCL runtimes but I don't have an intel mac anymore and
> I haven't tried PoCL.
*nods* For a long time we could expect "home computers" to
be Intel/AMD, but the computing environment has changed, and
maybe Apple tries to make its own platform look faster than it
is by forcing developers to special-case their code for Metal
rather than going through a generic API.
I guess FPGAs will be available in entry-level machines at some
point as well. So I understand that it will be a challenge to
get *dcompute* to a "ready for the public" stage when there is no
multi-person team behind it.
But I am not so sure about the apples-and-oranges aspect of it.
The presentation by Bryce quite explicitly focused on making
GPU computation available at the same level as CPU computation
(sans function pointers). This should be possible for homogeneous
memory systems (GPU and CPU sharing the same memory bus) in a
rather transparent manner, and languages that plan for this might
be perceived as much more productive and performant if/when
this becomes reality. And C++23 isn't far away, if they make the
deadline.
It was also interesting to me that ISO C23 will provide custom
bit-width integers (`_BitInt(N)`) and that this should make it
easier to compile C code to tighter FPGA logic. I remember that
LLVM has arbitrary bit-width integers (the `iN` types) in its IR,
though frontends mostly limit themselves to conventional bit
sizes. It just shows that being a system-level programming
language requires a lot of adaptability over time, and frameworks
like *dcompute* can never be considered truly finished.