Article: Interfacing D with C and Fortran
data pulverizer via Digitalmars-d-announce
digitalmars-d-announce at puremagic.com
Mon Apr 17 02:32:37 PDT 2017
On Friday, 14 April 2017 at 17:55:54 UTC, jmh530 wrote:
> On Thursday, 13 April 2017 at 11:23:32 UTC, jmh530 wrote:
>
> Just an FYI, I was looking at another post
>
> http://www.active-analytics.com/blog/fitting-glm-with-large-datasets/
>
> and the top part is a little confusing because the code below
> switches it up to do CC=BB*AA instead of CC=AA*BB.
>
> If I'm understanding it correctly, you originally have an m×n
> matrix times an n×p matrix, then you partition the left-hand
> side to be m×k and the right-hand side to be k×p, loop through,
> and add them up. However, at the top you say that A (which at
> the top is the left-hand variable) is split up by rows, whereas
> the code clearly splits the left-hand side (B here) by columns
> (BB is 5×100 and B is a 10-element list of 5×10 matrices).
Sorry, I didn't see your question until now. That article was
something I worked on years ago. The main principle is that
you split and aggregate over repeated indices. The code is
intended to illustrate the principle. Don't get too hung up
on matching the code symbols to the equations - the principle
is the main thing. I wrote an R package where the important
bits are written in C++:
https://cran.r-project.org/web/packages/bigReg/index.html
which uses the same principle for GLMs.
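The "split and aggregate over repeated indices" principle can be sketched quickly. This is a minimal NumPy illustration (not the article's or the bigReg package's code): the shared inner index n of an (m×n)·(n×p) product is chopped into chunks, and each chunk's partial product is accumulated into the result.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p, k = 5, 100, 7, 10          # split the shared index n into chunks of size k
A = rng.standard_normal((m, n))     # left factor, split by columns
B = rng.standard_normal((n, p))     # right factor, split by rows

C = np.zeros((m, p))
for i in range(0, n, k):
    # each chunk contributes a full m x p partial product
    C += A[:, i:i+k] @ B[i:i+k, :]

# aggregating the chunked partial products recovers the full product
assert np.allclose(C, A @ B)
```

Because each chunk of A (columns) and B (rows) can live on disk and be loaded one at a time, the full product never needs both matrices in memory at once, which is the point of the blog post.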
More importantly, however, that algorithm is not efficient! At
least not as efficient as gradient descent, or better still
stochastic gradient descent, or their respective modifications.
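For comparison, here is a hedged sketch of stochastic gradient descent fitting a GLM (logistic regression, i.e. the binomial family with logit link). The simulated data, learning rate, and epoch count are illustrative assumptions, not values from the thread; SGD only ever touches one observation per update, so it also suits datasets too large for memory.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 1000, 3
X = rng.standard_normal((n, d))
true_beta = np.array([1.0, -2.0, 0.5])          # assumed ground truth for the demo
prob = 1.0 / (1.0 + np.exp(-X @ true_beta))
y = (rng.random(n) < prob).astype(float)

beta = np.zeros(d)
lr = 0.1                                        # fixed step size, chosen for the demo
for epoch in range(20):
    for i in rng.permutation(n):                # one observation per update
        mu = 1.0 / (1.0 + np.exp(-X[i] @ beta))
        beta += lr * (y[i] - mu) * X[i]         # score contribution of observation i
```

Each update ascends the log-likelihood using a single row of X, so the working set is one observation rather than a whole block of the design matrix.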