Scientific computing and parallel computing C++23/C++26
Nicholas Wilson
iamthewilsonator at hotmail.com
Sat Jan 15 00:29:20 UTC 2022
On Friday, 14 January 2022 at 15:17:59 UTC, Ola Fosheim Grøstad
wrote:
> *nods* For a long time we could expect "home computers"
> to be Intel/AMD, but then the computing environment changed and
> maybe Apple tries to make its own platform stand out as faster
> than it is by forcing developers to special case their code for
> Metal rather than going through a generic API.
>
> I guess FPGAs will be available in entry level machines at some
> point as well. So, I understand that it will be a challenge to
> get *dcompute* to a "ready for the public" stage when there is
> no multi-person team behind it.
Maybe, though I suspect not for a while, but that could be
wildly wrong. Anyway, I don't think they (FPGAs) will be too
difficult to support, provided the vendor in question provides an
OpenCL implementation. The only thing to do is support `pipe`s.
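To give a rough idea of what supporting pipes would mean, here's an
untested OpenCL C sketch (device side only; the kernel names and the
int payload are made up for illustration):

```c
/* Producer: each work-item pushes one packet into the pipe. */
__kernel void producer(__global const int *src, __write_only pipe int out)
{
    size_t gid = get_global_id(0);
    int value = src[gid];
    /* write_pipe returns 0 on success; a real kernel would handle
     * failure (e.g. retry) rather than silently dropping the packet. */
    write_pipe(out, &value);
}

/* Consumer: each work-item pops one packet and stores it. */
__kernel void consumer(__read_only pipe int in, __global int *dst)
{
    size_t gid = get_global_id(0);
    int value;
    if (read_pipe(in, &value) == 0)
        dst[gid] = value;
}
```

dcompute would need a way to expose that pipe argument type and the
read/write intrinsics; the above is just the OpenCL 2.0 shape of it.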
As for manpower, the reason is that I don't have any particular
personal need for dcompute these days. I'm happy to add features
for people who need something in particular, e.g. Vulkan compute
shaders or textures, and PRs are welcome. Though if Bruce makes
millions and gives me a job then that will obviously change ;)
> But I am not so sure about the apples and oranges aspect of it.
The apples to oranges comment was about doing benchmarks of CPU
vs. GPU; there are so many factors that make performance
comparisons (more) difficult. Is the GPU discrete? How important
is latency vs. throughput? How "powerful" is the GPU compared to
the CPU? How well suited to the task is the GPU? The list goes on.
It's hard enough to do CPU benchmarks in an unbiased way.
If the intention is to say, "look at the speedup you can get for
$TASK using $COMMON_HARDWARE" then yeah, that would be possible.
It would certainly be possible to do a benchmark of, say, "ease
of implementation with comparable performance" of dcompute vs.
CUDA, e.g. LoC, verbosity, brittleness etc., since the main
advantage of D/dcompute (vs. CUDA) is enumeration of kernel
designs for performance. That would give a nice measurable goal
for improving usability.
> The presentation by Bryce was quite explicitly focusing on
> making GPU computation available at the same level as CPU
> computations (sans function pointers). This should be possible
> for homogeneous memory systems (GPU and CPU sharing the same
> memory bus) in a rather transparent manner and languages that
> plan for this might be perceived as being much more productive
> and performant if/when this becomes reality. And C++23 isn't
> far away, if they make the deadline.
Definitely. Homogeneous memory is interesting for the ability to
make GPUs do the things GPUs are good at and leave the rest to
the CPU, without worrying about memory transfers across PCI-e.
That's something CUDA can't take advantage of, on account of
Nvidia GPUs being discrete only. I've no idea how caching works
in a system like that, though.
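For a rough picture of what that buys you, here's an untested
host-side C sketch using OpenCL 2.0 coarse-grained SVM as a
stand-in for truly shared memory (context/queue/kernel setup and
error checking omitted; all the names and sizes are made up):

```c
#define CL_TARGET_OPENCL_VERSION 200
#include <CL/cl.h>
#include <stddef.h>

void run_on_shared_buffer(cl_context ctx, cl_command_queue queue, cl_kernel kernel)
{
    const size_t n = 1 << 20;

    /* One allocation visible to both host and device --
     * no clEnqueueWriteBuffer / clEnqueueReadBuffer pair. */
    float *data = (float *)clSVMAlloc(ctx, CL_MEM_READ_WRITE, n * sizeof(float), 0);

    /* CPU fills the buffer directly (map/unmap for coarse-grained SVM). */
    clEnqueueSVMMap(queue, CL_TRUE, CL_MAP_WRITE, data, n * sizeof(float), 0, NULL, NULL);
    for (size_t i = 0; i < n; ++i)
        data[i] = (float)i;
    clEnqueueSVMUnmap(queue, data, 0, NULL, NULL);

    /* GPU works on the same pointer. */
    clSetKernelArgSVMPointer(kernel, 0, data);
    clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &n, NULL, 0, NULL, NULL);
    clFinish(queue);

    /* CPU reads the results without an explicit copy. */
    clEnqueueSVMMap(queue, CL_TRUE, CL_MAP_READ, data, n * sizeof(float), 0, NULL, NULL);
    /* ... use data[i] on the CPU ... */
    clEnqueueSVMUnmap(queue, data, 0, NULL, NULL);

    clSVMFree(ctx, data);
}
```

On a discrete GPU the runtime may still migrate pages behind your
back; on an actual shared-memory system the map/unmap calls become
essentially free, which is the interesting case.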
> It was also interesting to me that ISO C23 will provide custom
> bit width integers and that this would make it easier to
> efficiently compile C-code to tighter FPGA logic. I remember
> that LLVM used to have that in their IR, but I think it was
> taken out and limited to more conventional bit sizes?
Arbitrary-precision integers are still a part of LLVM, and I
presume LLVM IR. The problem with that is, as with address-space
qualified pointers, D has no way to declare such types. I seem to
remember Luís Marques doing something crazy like that (maybe in a
DConf presentation?), compiling D to Verilog.
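For reference, C23's bit-precise integers look like this; they map
pretty directly onto LLVM's arbitrary-width iN types, which is what
makes them attractive for tight FPGA logic. Just a sketch, needs a
C23-capable compiler (e.g. a recent Clang with -std=c2x), and the
values are made up:

```c
#include <stdio.h>

int main(void)
{
    _BitInt(12) coord = 2047;          /* 12-bit signed, range -2048 .. 2047 */
    unsigned _BitInt(7) opcode = 127;  /* 7-bit unsigned, range 0 .. 127 */

    opcode += 1;                       /* unsigned: wraps to 0 at 7 bits */
    coord  -= 1;

    printf("%d %u\n", (int)coord, (unsigned)opcode);
    return 0;
}
```

D has nothing equivalent to declare at the type level, which is the
gap I was getting at.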
> It just shows that being a system-level programming language
> requires a lot of adaptability over time and frameworks like
> *dcompute* cannot ever be considered truly finished.
Of course.