Scientific computing and parallel computing C++23/C++26
bcarneal at gmail.com
Sat Jan 15 18:48:32 UTC 2022
On Saturday, 15 January 2022 at 17:29:35 UTC, Guillaume Piolat
wrote:
> On Saturday, 15 January 2022 at 12:21:37 UTC, Ola Fosheim
> Grøstad wrote:
>>> Definitely. Homogenous memory is interesting for the ability
>>> to make GPUs do the things GPUs are good at and leave the
>>> rest to the CPU without worrying about memory transfer across
>>> the PCI-e. Something which CUDA can't take advantage of on
>>> account of nvidia GPUs being only discrete.
>> Steam Deck, which appears to come out next month, seems to run
>> under Linux and has an "AMD APU" with a modern GPU and CPU
>> integrated on the same chip
> Related: has anyone here seen an actual measured performance
> gain from co-located CPU and GPU on the same chip? I used to
> test with OpenCL + Intel SoC and again, it was underwhelming
> and not faster. I'd be happy to know about other experiences.
The link below on the vkpolybench software includes graphs for
integrated GPUs, among others, and shows significant (more than
SIMD width) speedups wrt a single CPU core for many of the
benchmarks, but also break-even or worse on a few. Reports on
real-world experiences with the integrated accelerators would be
welcome.
On paper, at least, it looks like SoC GPU performance will be
severely impacted by working set size, but who isn't?
Currently it also looks like the dcompute/SoC-GPU version will
beat out my SIMD variant but it'll be at least a few months
before I have hard data to share.
Anyone out there have real world data now?
More information about the Digitalmars-d mailing list