Efficient code

performance

#1

Setup for MWE

using Distributions
qidx = collect(1:10)
x = rand([0,1], 10)
theta = randn()
gamma=randn(2)
d = randn(10)
A = [.3 .7; .2 .8]
p=[.8, .2]

I’d like to run the following code, but I’m worried whether I’ve written this efficiently. For my example m will stay as 2 but n will be much larger.

This is forward-backward algo for HMMs in matrix notation but with a time varying emission matrix.

 m = length(p)                                                      
 n = length(x)                                                      
 G = zeros(n, m)                                                    
 l = zeros(m)                                                       
 c = zeros(n)                                                       
                                                                    
 B = zeros(m, m, n)                                                 
  for j in 1:n                                      
   s = cdf(Logistic(theta - gamma[2]), d[qidx[j]]) 
   g = cdf(Logistic(d[qidx[j]] - gamma[1]), theta) 
   if x[j] == 1                                    
     B[:, :, j] = diagm([1.0 - s, g])              
   else                                            
     B[:, :, j] = diagm([s, 1.0 - g])              
   end                                             
 end                                               
                                                   
 G[1,: ] = B[:,:,1]'p                              
                                                   
 for j in 2:n                                      
   G[j,: ] = G[j-1,: ]' * A * B[:,:,j]             
 end                                               

In particular I am interested in whether the way I have allocated the B matrix is good (the last dimension is what is looped over), would it be better if it was the first dimension? I always get confused with which dimension should be looped over first, second, etc.


EDIT:
I’ve changed B so that the first dimension is used in the innermost loop.

 B = zeros(n, m, m)                                     
                                                        
 for j in 1:n                                           
   s = cdf(Logistic(theta - gamma[2]), d[qidx[j]])      
   g = cdf(Logistic(d[qidx[j]] - gamma[1]), theta)      
   if x[j] == 1                                         
     B[j,:, : ] = diagm([1.0 - s, g])                   
   else                                                 
     B[j,:,: ] = diagm([s, 1.0 - g])                    
   end                                                  
 end                                                    
                                                        
 G[1,: ] = B[1,:,: ]'p                                  
                                                        
 for j in 2:n                                           
   G[j,: ] = G[j-1,: ]' * A * B[j,:,: ]                 
 end                                                                                 

#2

The first dimension should be used for innermost loops with Julia’s column-major storage.