Error running concurrent process and storing results in array

data pulverizer data.pulverizer at gmail.com
Fri May 8 13:36:22 UTC 2020


On Thursday, 7 May 2020 at 14:49:43 UTC, data pulverizer wrote:
> After running the Julia code by the Julia community they made 
> some changes (using views rather than passing copies of the 
> array) and their time has come down to ~ 2.5 seconds. The plot 
> thickens.

I've run the Chapel code past the Chapel programming language 
people and they've brought the time down to ~ 6.5 seconds. I've 
disallowed calling BLAS because I'm looking at the performance of 
the language implementations themselves rather than their 
ability to call other libraries.

So far the times are looking like this:

D:      ~ 1.5 seconds
Julia:  ~ 2.5 seconds
Chapel: ~ 6.5 seconds

I've been working on the Nim benchmark and have written a small 
set of byte-order functions for big- to little-endian conversion 
(https://gist.github.com/dataPulverizer/744fadf8924ae96135fc600ac86c7060), which was fun. It provides ntoh, hton, and so forth, applicable to any basic type. Next I'm writing a small matrix type in the same vein as the D matrix type I wrote, and then comes the easy bit: writing the kernel matrix algorithm itself.
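For reference, here is a minimal sketch of the same idea in D 
(the gist itself is Nim): a generic ntoh that byte-swaps only on 
little-endian hosts, built on swapEndian from std.bitmanip. The 
float overload is my own illustration, not code from the gist.

import std.bitmanip : swapEndian;
import std.traits : isIntegral;

/* Convert from network (big-endian) to host byte order; on a
   big-endian host this is the identity. */
T ntoh(T)(T value) if (isIntegral!T)
{
    version (LittleEndian)
        return swapEndian(value);
    else
        return value;
}

/* Floats are reinterpreted as a same-sized integer, swapped,
   and reinterpreted back (illustration only). */
T ntoh(T : float)(T value)
{
    uint bits = ntoh(*cast(uint*) &value);
    return *cast(T*) &bits;
}

alias hton = ntoh; // the byte swap is its own inverse

void main()
{
    import std.stdio : writefln;
    ushort x = 0x1234;
    writefln("%04x", ntoh(x)); // 3412 on little-endian hardware
}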

In the end I'll run the benchmark on data of various sizes. 
Currently I'm just running it on the (10,000 x 784) data set, 
which outputs a (10,000 x 10,000) matrix. I'll end up running 
(5,000 x 784), (10,000 x 784), (20,000 x 784), (30,000 x 784), 
(40,000 x 784), (50,000 x 784), and (60,000 x 784). Ideally I'd 
measure each one 100 times and plot confidence intervals, but I'll 
have to settle for measuring each one 3 times and taking an 
average, otherwise it will take too much time; a sketch of the 
timing harness follows below. I don't think that D will have it 
all its own way across the data sizes; from what I can see, Julia 
may do better at the largest data set, and maybe SIMD will be a 
factor there.
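A minimal D sketch of the kind of harness I mean, using 
std.datetime.stopwatch; the function name and the stand-in 
workload are my own assumptions, not the actual benchmark code.

import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

/* Run `fun` `runs` times and return the mean wall-clock time in
   seconds: the simplest "measure n times and average". */
double meanSeconds(alias fun)(size_t runs = 3)
{
    double total = 0;
    foreach (_; 0 .. runs)
    {
        auto sw = StopWatch(AutoStart.yes);
        fun();
        sw.stop();
        total += sw.peek.total!"usecs" / 1e6;
    }
    return total / runs;
}

void main()
{
    // stand-in workload; the real call would build the kernel matrix
    auto secs = meanSeconds!({
        import core.thread : Thread;
        import core.time : msecs;
        Thread.sleep(20.msecs);
    })();
    writefln("mean over 3 runs: %.3f s", secs);
}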

The data set sizes are not randomly chosen. For many common data 
science tasks, perhaps > 90% of what data scientists currently 
work on, people work with data sets in this range or even 
smaller; the big data stuff is much less common unless you're 
working for Google (or another FAANG) or a specialist startup. I 
remember running a kernel clustering in commonly used "data 
science" languages (none of which I'm benchmarking here): it 
wasn't done after an hour, then hung and crashed, while something 
I implemented in Julia was done in a minute. Calculating kernel 
matrices is the cornerstone of many kernel-based machine learning 
methods: kernel PCA, kernel clustering, SVMs, and so on. It's a 
pretty important thing to calculate, it shows the potential of 
these languages in the data science field, and I think an article 
like this is useful for people who implement numerical libraries; 
a sketch of the core computation follows below. I'm also hoping 
to throw in C++ by way of comparison.
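Since the whole exercise revolves around this one computation, 
here is a minimal D sketch of what's being benchmarked: filling a 
symmetric Gram matrix in parallel over rows and storing the 
results in a shared array. It's my own illustration (the function 
name and the plain dot-product kernel are assumptions), not the 
actual benchmark code.

import std.parallelism : parallel;
import std.range : iota;

/* Fill the symmetric n x n Gram matrix K (row-major) for a
   dot-product kernel over n observations of p features stored
   row-major in `data`. Rows are distributed over the task pool;
   only the upper triangle is computed, then mirrored. Each
   element is written by exactly one task, so no race occurs. */
void kernelMatrix(const double[] data, size_t n, size_t p, double[] K)
{
    foreach (i; parallel(iota(n)))
    {
        foreach (j; i .. n)
        {
            double s = 0;
            foreach (k; 0 .. p)
                s += data[i*p + k] * data[j*p + k];
            K[i*n + j] = s;
            K[j*n + i] = s; // symmetry
        }
    }
}

void main()
{
    enum n = 4, p = 3;
    auto data = new double[](n * p);
    data[] = 1.0;
    auto K = new double[](n * n);
    kernelMatrix(data, n, p, K);
    import std.stdio : writeln;
    writeln(K[0 .. n]); // each entry is p = 3
}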



