Heya,
I am doing some Markov process simulations which output NxN matrices. These are currently stored in a 3D array X[N, N, M], where M can be very large (1E6 or more).
My code currently uses a reduce call, reduce(*, eachslice(X; dims=3)), to get the overall matrix product – a single NxN matrix.
In the long term N could be quite large, and in the long-long term these will be rectangular matrices of varying sizes: (M x N) * (N x P) * (P x Q) * … * (U x V) → (M x V).
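For concreteness, in that variable-size case I imagine holding the factors in a vector of matrices rather than a 3D array and folding over it, something like the sketch below (illustrative only, with random rectangular factors standing in for my actual simulation output):

dims = [4, 3, 5, 2, 6]                                      # example chain of dimensions
mats = [rand(dims[i], dims[i+1]) for i in 1:length(dims)-1] # 4x3, 3x5, 5x2, 2x6 factors
chain = foldl(*, mats)                                      # left-to-right product, a 4x6 matrix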
Question 1: Would it be better to take a running product? I tried doing this straightforwardly with a loop: the time is roughly equivalent, but the allocations are double those of just storing the whole of X in memory and then reducing at the end. But perhaps I am doing it poorly? (A sketch of the pre-allocated variant I have in mind follows the benchmark code below.)
using BenchmarkTools
using LinearAlgebra
function store_mem(N, G)
    X = zeros(N, N, G)
    for i in 1:G
        C = rand(N, N)
        X[:, :, i] = C ./ sum(C, dims=1)  # stochastic matrix
    end
    return reduce(*, eachslice(X; dims=3))
end
function store_inplace(N, G)
    X = Matrix(I, N, N)
    for i in 1:G
        C = rand(N, N)
        X *= C ./ sum(C, dims=1)  # stochastic matrix
    end
    return X
end
mem_bench = @benchmark store_mem(10, 1000000)
inplace_bench = @benchmark store_inplace(10, 1000000)
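To show what I mean by "doing it poorly": the variant I have been considering pre-allocates the factor and a scratch buffer and uses mul!, so the loop avoids allocating fresh NxN matrices every iteration (only the small sum(C, dims=1) row still allocates). This is just a sketch of my idea (the name store_prealloc, the buffer swap, and the use of rand!/mul! are mine, not part of the benchmarks above):

using LinearAlgebra
using Random

function store_prealloc(N, G)
    X   = Matrix{Float64}(I, N, N)   # running product, starts as the identity
    C   = Matrix{Float64}(undef, N, N)
    tmp = Matrix{Float64}(undef, N, N)
    for i in 1:G
        rand!(C)
        C ./= sum(C, dims=1)         # make the factor column-stochastic, in place
        mul!(tmp, X, C)              # tmp = X * C without allocating the product
        X, tmp = tmp, X              # swap buffers so X holds the running product
    end
    return X
end

prealloc_bench = @benchmark store_prealloc(10, 1000000)

Is this roughly the right idiom for a running product, or is there a more standard way?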
Question 2: A friend mentioned this was reminiscent of tensor contractions, so I tried the Tullio.jl package, but I am finding the notation for the macro quite confusing. In my attempt (MWE below) I seem to be getting the product of each entry over the third dimension rather than the product of the matrices. I have looked over the list of example calls in the repo but I can't seem to find an equivalent contraction. I also assume that if I go this route any advantage would disappear once the matrices become variable in size, or is there a way to cope with this?
using Tullio
c = rand(3, 3, 3)
truth = c[:, :, 1] * c[:, :, 2] * c[:, :, 3];
display(truth)
direct = reduce(*, eachslice(c; dims=3));
display(direct)
@tullio (*) einstein[i, j] := c[i, j, k];  # reduces over k entrywise: einstein[i, j] = prod_k c[i, j, k], not a chain of matrix products
display(einstein)
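For comparison, the contraction I do understand is the plain two-matrix product, which I would write like this (slicing first to keep the index notation simple; this is just my reading of the macro, so it may well be off):

A = c[:, :, 1]
B = c[:, :, 2]
@tullio pair[i, j] := A[i, k] * B[k, j]   # standard matrix product: sum over k
display(pair)
display(A * B)   # should match

What I cannot see is how to extend this to a product over a variable number of slices, since the length of the chain changes the number of contracted indices in the expression.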