Hi all,
I have a partial differential equation to solve using finite differences. I am trying to work out whether it is better to use a vectorized version or a loop version to compute the right-hand side (RHS) of the equation, which contains the spatial discretization. What I am struggling to understand is that, for a given grid size (i.e. the number of points at which the RHS is evaluated), the vectorized version performs worse than the loop version.
However, if I increase the grid size, the difference in performance between the two becomes smaller.
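To make the two styles concrete, here is a minimal sketch of what I mean by a loop RHS and a vectorized RHS. It assumes a 1D second-difference stencil with the boundary values set to zero. This is only an illustration and not my actual equation; the names `rhs_loop!` and `rhs_vect!` are made up for this post.

```julia
# Loop version: second-difference stencil on the interior points of a 1D grid.
# (Illustrative stencil only; boundary entries are simply set to zero.)
function rhs_loop!(du, u, dx)
    n = length(u)
    du[1] = 0.0
    du[n] = 0.0
    for i in 2:n-1
        du[i] = (u[i-1] - 2u[i] + u[i+1]) / dx^2
    end
    return du
end

# Vectorized version: the same stencil written with views and dot-broadcasting,
# so the slices are not copied and the arithmetic is fused into one pass.
function rhs_vect!(du, u, dx)
    n = length(u)
    du[1] = 0.0
    du[n] = 0.0
    @views du[2:n-1] .= (u[1:n-2] .- 2 .* u[2:n-1] .+ u[3:n]) ./ dx^2
    return du
end

x = range(0.0, 1.0; length = 101)
dx = step(x)
u = collect(x) .^ 2                 # test field u(x) = x^2
du1 = rhs_loop!(similar(u), u, dx)
du2 = rhs_vect!(similar(u), u, dx)
```

Both functions write into a preallocated `du`, so any difference in timing should come from how the stencil is evaluated rather than from allocating the output.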
To better understand whether Julia performs better with loops or with vectorized code, I wrote the following test:
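using BenchmarkTools  # provides @btime

# Vectorized version: writes A + B into the global array C
function vect_sum(A,B)
    C .= A + B;
    return C
end

# Loop version: rows (i) in the outer loop, columns (j) in the inner loop
function loop_sum(A,B,C,m,n)
    for i=1:m
        for j=1:n
            C[i,j] = A[i,j] + B[i,j]
        end
    end
    return C
end
n = 1000
m = 1000
A = rand(m,n)
B = rand(m,n)
C = zeros(m,n);
@btime vect_sum(A,B);
C = zeros(m,n);
@btime loop_sum(A,B,C,m,n);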
Here, despite the common wisdom that loops in Julia are fast, I find that the vectorized version of the code performs better than the loop.
Does anybody have a hint as to why this is happening?
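For reference, here are two variants that might be worth timing alongside the test above. This is only a sketch: I am assuming that the loop order and the temporary array created by `A + B` are relevant to the timings, and the names `loop_sum_colmajor!` and `vect_sum!` are made up for this post. Julia stores arrays in column-major order, so the first variant puts the row index in the inner loop. The second takes `C` as an argument and uses `.+` so that the addition is fused into the assignment.

```julia
# Loop variant with the row index innermost, matching Julia's column-major layout.
function loop_sum_colmajor!(C, A, B)
    @assert size(A) == size(B) == size(C)   # makes the @inbounds below safe
    m, n = size(C)
    @inbounds for j in 1:n        # columns in the outer loop
        for i in 1:m              # rows in the inner loop: contiguous memory access
            C[i, j] = A[i, j] + B[i, j]
        end
    end
    return C
end

# Broadcast variant: C is passed in rather than read from a global, and
# `.+` fuses the addition into the assignment (no temporary for A + B).
function vect_sum!(C, A, B)
    C .= A .+ B
    return C
end

m, n = 1000, 1000
A = rand(m, n)
B = rand(m, n)
C1 = zeros(m, n)
C2 = zeros(m, n)

loop_sum_colmajor!(C1, A, B)
vect_sum!(C2, A, B)
```

With BenchmarkTools loaded, these could be timed as `@btime loop_sum_colmajor!($C1, $A, $B)` and `@btime vect_sum!($C2, $A, $B)`; the `$` interpolation keeps global-variable access out of the measurement.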
Thanks
