Scientific computing and parallel computing C++23/C++26
Ola Fosheim Grøstad
ola.fosheim.grostad at gmail.com
Fri Jan 14 15:17:59 UTC 2022
On Friday, 14 January 2022 at 01:39:32 UTC, Nicholas Wilson wrote:
> On Thursday, 13 January 2022 at 22:27:27 UTC, Ola Fosheim
> Grøstad wrote:
>> Are there some performance benchmarks on modest hardware?
>> (e.g. a standard macbook, imac or mac mini) Benchmarks that
>> compares dcompute to CPU with auto-vectorization (SIMD)?
>
> Part of the difficulty with that is that it is an apples-to-
> oranges comparison. Also I no longer have hardware that can run
> dcompute, as my old windows box (with intel x86 and OpenCL 2.1
> with an nvidia GPU) died some time ago.
>
> Unfortunately Macs and dcompute don't work very well. CUDA
> requires nvidia, and OpenCL needs the ability to run SPIR-V
> (clCreateProgramWithIL call) which requires OpenCL 2.x which
> Apple does not support. Hence supporting Metal was of some
> interest. You might in theory be able to use PoCL or intel
> based OpenCL runtimes but I don't have an intel mac anymore and
> I haven't tried PoCL.
*nods* For a long time we could expect "home computers" to
be Intel/AMD, but the computing environment has changed, and
maybe Apple tries to make its own platform look faster than it
is by forcing developers to special-case their code for Metal
rather than going through a generic API.
I guess FPGAs will be available in entry-level machines at some
point as well. So I understand that it will be a challenge to
get *dcompute* to a "ready for the public" stage when there is no
multi-person team behind it.
But I am not so sure about the apples-and-oranges aspect of it.
The presentation by Bryce quite explicitly focused on making
GPU computation available at the same level as CPU computation
(sans function pointers). This should be possible for homogeneous
memory systems (GPU and CPU sharing the same memory bus) in a
rather transparent manner, and languages that plan for this might
be perceived as much more productive and performant if/when
this becomes reality. And C++23 isn't far away, if they make the
deadline.
It was also interesting to me that ISO C23 will provide custom
bit-width integers (`_BitInt(N)`) and that this should make it
easier to compile C code to tighter FPGA logic. I remember that
LLVM has arbitrary bit-width integers (the `iN` types) in its IR,
though frontends mostly limit themselves to conventional bit
sizes. It just shows that being a system-level programming
language requires a lot of adaptability over time, and frameworks
like *dcompute* can never be considered truly finished.