A Friendly Challenge for D

Fri Oct 12 16:19:59 UTC 2018

On Friday, 12 October 2018 at 15:11:17 UTC, welkam wrote:
> On Wednesday, 10 October 2018 at 16:15:56 UTC, Jabari Zakiya 
> wrote:
>> What I am requesting here is for a person(s) who is an 
>> "expert" (very good) to create a very fast D version, using 
>> whatever tricks it has to maximize performance.
>>
>> I would like to include in my paper a good comparison of 
>> various implementations in different compiled languages 
>> (C/C++, D, Nim, etc) to show how it performs with each.
>
> I looked into your NIM code and from programmers point of view 
> there is nothing interesting going on. Simple data structures 
> and simple operations. If you wrote equivalent code in C, C++, 
> D, NIM, Rust, Zig and compiled with same optimizing compiler 
> (llvm or gcc) you should get the same machine code and almost 
> the same performance (less than 1% difference due to runtime). 
> If you got different machine code for equivalent implementation 
> then you should file a bug report.
>
> The only way you will get different performance is by changing 
> implementation details but then you would compare apples to 
> oranges.

Hmm, I don't think what you're saying about similar 
output|performance across languages is empirically correct, 
but that's really not the point of the challenge.

The real point of the challenge is to see what idiomatic code, 
written for performance, using the best resources that the 
language provides, will produce compared to the Nim version. 
It's not to see what a line-by-line translation from Nim to D 
would look like. That may be a start to get something working, 
but it shouldn't be the end goal.

I'm using the Nim version here as the "reference implementation" 
so it can be used as the standard for comparison (accuracy of 
results and performance). The goal for D (et al) users is to use 
whatever resources their language provides to maybe do better.

Example: Nim currently doesn't provide standard bitarrays. Using 
bitarrays in place of byte arrays should perform faster, because 
each flag takes one bit instead of eight, so up to eight times as 
much data fits in cache.
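
To make the bitarray point concrete, here is a minimal sketch. It 
is written in C++ rather than D or Nim, and it uses a plain Sieve 
of Eratosthenes as a stand-in, not the algorithm from the 
challenge. The function names are made up for illustration. Both 
functions return the same count; the second stores one bit per 
candidate instead of one byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Byte-array version: one byte per candidate number.
std::size_t count_primes_bytes(std::size_t limit) {
    assert(limit < std::numeric_limits<std::size_t>::max());  // limit + 1 must not wrap
    if (limit < 2) return 0;
    std::vector<std::uint8_t> composite(limit + 1, 0);
    std::size_t count = 0;
    for (std::size_t i = 2; i <= limit; ++i) {
        if (composite[i]) continue;
        ++count;
        if (i > limit / i) continue;  // i * i > limit: nothing left to mark
        for (std::size_t j = i * i; j <= limit; j += i) composite[j] = 1;
    }
    return count;
}

// Bit-array version: 64 candidates packed into each 64-bit word,
// so the same range needs roughly 1/8 of the memory.
std::size_t count_primes_bits(std::size_t limit) {
    assert(limit < std::numeric_limits<std::size_t>::max());
    if (limit < 2) return 0;
    std::vector<std::uint64_t> words(limit / 64 + 1, 0);
    auto is_set = [&words](std::size_t n) {
        return ((words[n / 64] >> (n % 64)) & 1u) != 0;
    };
    auto set = [&words](std::size_t n) {
        words[n / 64] |= std::uint64_t{1} << (n % 64);
    };
    std::size_t count = 0;
    for (std::size_t i = 2; i <= limit; ++i) {
        if (is_set(i)) continue;
        ++count;
        if (i > limit / i) continue;  // i * i > limit: nothing left to mark
        for (std::size_t j = i * i; j <= limit; j += i) set(j);
    }
    return count;
}
```

Whether the bit version is actually faster depends on the range 
and the hardware: the extra shift and mask per access costs 
something, so it has to be measured. For D, the standard library 
has std.bitmanip.BitArray, which may be a starting point.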

Also, to parallelize the algorithm, maybe using OpenMP, CUDA, etc. 
is the way to do it for D. I don't know what constructs D uses 
for parallel multiprocessing. And as noted before, this 
algorithm screams out to be done with GPUs.
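
As a rough picture of what "parallelize" means here, the sketch 
below splits a range into independent segments and sieves each on 
its own thread. It is C++ with std::thread, not D, and again uses 
a plain segmented Sieve of Eratosthenes as a stand-in for the 
challenge algorithm. All names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// All primes <= n, found with a simple sieve (n is small: sqrt of the limit).
std::vector<std::size_t> base_primes(std::size_t n) {
    std::vector<std::uint8_t> composite(n + 1, 0);
    std::vector<std::size_t> primes;
    for (std::size_t i = 2; i <= n; ++i) {
        if (composite[i]) continue;
        primes.push_back(i);
        for (std::size_t j = i * i; j <= n; j += i) composite[j] = 1;
    }
    return primes;
}

// Count the primes in the half-open segment [lo, hi).
std::size_t count_segment(std::size_t lo, std::size_t hi,
                          const std::vector<std::size_t>& primes) {
    assert(lo <= hi);
    std::vector<std::uint8_t> composite(hi - lo, 0);
    for (std::size_t p : primes) {
        if (p * p >= hi) break;
        // First multiple of p inside the segment, but never below p * p.
        std::size_t start = std::max(p * p, (lo + p - 1) / p * p);
        for (std::size_t j = start; j < hi; j += p) composite[j - lo] = 1;
    }
    std::size_t count = 0;
    for (std::size_t n = lo; n < hi; ++n)
        if (n >= 2 && !composite[n - lo]) ++count;
    return count;
}

// Split [0, limit] into one segment per thread and sum the partial counts.
std::size_t count_primes_parallel(std::size_t limit, unsigned num_threads) {
    if (limit < 2) return 0;
    if (num_threads == 0) num_threads = 1;
    std::size_t root = 0;  // integer square root of limit
    while ((root + 1) * (root + 1) <= limit) ++root;
    const std::vector<std::size_t> primes = base_primes(root);

    const std::size_t total = limit + 1;
    const std::size_t chunk = (total + num_threads - 1) / num_threads;
    std::vector<std::size_t> partial(num_threads, 0);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t lo = std::min(total, t * chunk);
        const std::size_t hi = std::min(total, lo + chunk);
        if (lo == hi) continue;  // more threads than work
        // Each worker writes only its own slot, so no locking is needed.
        workers.emplace_back([&partial, &primes, t, lo, hi] {
            partial[t] = count_segment(lo, hi, primes);
        });
    }
    for (std::thread& w : workers) w.join();

    std::size_t sum = 0;
    for (std::size_t c : partial) sum += c;
    return sum;
}
```

The segments share only the read-only list of base primes, which 
is why this kind of sieve splits so easily across cores. A GPU 
version would follow the same decomposition with many more, 
smaller segments.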

But you are correct that the Nim code uses very simple coding 
operations. That is one of its beauties! :) It is simple to 
understand and implement mathematically, short and simple to 
code, and architecturally adaptable to hardware.

So to really do the challenge, the Nim code needs to be compiled 
and run (per instructions in code) to use as the "reference 
implementation", to see what correct outputs look like, and their 
times, and then other implementations can be compared to it.

I would hope that, after getting an output-correct implementation 
done (to show you really know what you're doing), alternative 
implementations can then be done to wring out better performance.

I think this is a good challenge for anyone wanting to learn D 
too. It involves something substantially more than a "toy" 
algorithm, yet is short enough to do with minimal time and 
effort, and it requires knowing (learning about) D in enough 
detail to determine the "best" (alternative) way to do it.

Finally, a really fast D implementation can be a marketing 
bonanza to show people in the numerical analysis, data|signal 
processing fields, et al, that D can be used by them to solve 
their problems and be more performant than C++, etc.

Again, people should feel free to email me if they want more 
direct answers to questions, or help.