Hi everyone,
Based on my last question here and from what I understand in the performance tips section, Julia is a column major and I should try to (1) start the most inner loop by column and (2) fill pre-allocated output by column if possible.
I tried to verify this for myself by an example using the Distributions package, but I did not see any performance discrepancy of row- or columnwise allocation. Does this only occur with double loops normally? I would actually rather write code row-by-row normally as time-series data is usually stated like that, but if performance is an issue I can also just transpose data and write my functions col-by-col.
Based on my limited understanding, the code I wrote seems to be type stable (at least @code_warntype did not show any ::Any types), here is the code I used:
#Import modules
using Distributions
using BenchmarkTools
function fun_rowwise(t::Int, distr)
x = zeros(Float64, (t, Distributions.length(distr)) )
for i in 1:size(x,1)
x[i,:] = rand(distr, 1)
end
x
end
function fun_colwise(t::Int, distr)
x = zeros(Float64, (Distributions.length(distr), t) )
for i in 1:size(x,2)
x[:,i] = rand(distr, 1)
end
x
end
#Assign time index and distributions
d_univ = Normal(0., 1.)
d_multiv = MvNormal([0., 0., 0., 0., 0.],[1., 1., 1., 1., 1.])
t = 10^6
fun_rowwise(t, d_univ)
fun_rowwise(t, d_multiv)
fun_colwise(t, d_univ)
fun_colwise(t, d_multiv)
@code_warntype fun_rowwise(t, d_univ)
@code_warntype fun_colwise(t, d_univ)
@benchmark fun_rowwise(t, d_univ)
@benchmark fun_colwise(t, d_univ)
@benchmark fun_rowwise(t, d_multiv)
@benchmark fun_colwise(t, d_multiv)