So, instead of posting the full code (which, to my surprise, still doesn’t work), I want to analyze the individual pieces, so that the wall of text doesn’t get confusing.
First: I need to multiply rectangular dense matrices with sparse square matrices.
I thought of going with mul!, but I’m running into some issues.
If I write:
```julia
using LinearAlgebra
using Kronecker  # for kronecker()

function foo()
    m = 2
    # Block_Sz1 = Array{Float64}(undef, 4, 2)
    Sz = [[0.5, 0] [0, -0.5]]
    A = [[0, 1, 0, 0] [0, 0, 3, 1]]
    Identity = [[1, 0] [0, 1]]
    mul!(Block_Sz1, kronecker(Identity, Sz), A)
end
foo()
```
It says that Block_Sz1 is not defined.
I can define it: if I uncomment that line, it all works.
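For reference, here is a minimal working version of the snippet above. It uses Base’s eager kron in place of Kronecker.jl’s kronecker so that it only needs the standard library, and the name foo_fixed is mine:

```julia
using LinearAlgebra

function foo_fixed()
    Sz = [[0.5, 0] [0, -0.5]]
    A  = Float64[[0, 1, 0, 0] [0, 0, 3, 1]]
    Id = [[1.0, 0] [0, 1]]
    # mul! never allocates its output: the destination must already
    # exist, with the right size and a compatible element type.
    Block_Sz1 = Array{Float64}(undef, 4, 2)
    mul!(Block_Sz1, kron(Id, Sz), A)
    return Block_Sz1
end

foo_fixed()
```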
Then I tried benchmarking, but it seems that @btime doesn’t know of the existence of Block_Sz1, while @time succeeds. (This is a BenchmarkTools quirk: @btime evaluates its expression in a separate scope, so local variables have to be interpolated with $, as in @btime mul!($Block_Sz1, ...).)
If I try to allocate Block_Sz1 for the sake of @btime (but still running without it), mul! throws a big error:

```
LoadError: InexactError: Int64(-0.5) (...)
```
I don’t know why, because the documentation of mul! says:

> Calculates the matrix-matrix or matrix-vector product AB and stores the result in Y, overwriting the existing value of Y. Note that Y must not be aliased with either A or B.
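A plausible cause of that InexactError (an assumption on my part, since the failing allocation isn’t shown): if Block_Sz1 was created with an integer element type, for example via similar(A) on the integer matrix A, then mul! has to store -0.5 into an Int64 slot and fails at the conversion. A minimal reproduction:

```julia
using LinearAlgebra

Sz = [0.5 0.0; 0.0 -0.5]
A  = [0 0; 1 0; 0 3; 0 1]            # eltype Int64
K  = kron([1.0 0.0; 0.0 1.0], Sz)    # eltype Float64

Y_bad = similar(A)                   # Int64 destination: cannot hold -0.5
err = try
    mul!(Y_bad, K, A)
    nothing
catch e
    e                                # InexactError: Int64(-0.5)
end

Y_ok = Array{Float64}(undef, 4, 2)   # Float64 destination works fine
mul!(Y_ok, K, A)
```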
However, we can proceed further, because I have sparse matrices.
Is mul! only the generic method, or does it cover the sparse case too?
I found this thread online: Mul! vs SparseMatrix.
It says that mul! is generic, while spmatmul! is the sparse-specific one; however, in the SparseArrays documentation I can’t find anything related to the keyword “mul”.
We can try mul! anyway.
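As a quick sanity check on this (my own experiment, not from that thread): mul! is the public generic name, and SparseArrays extends it with specialized methods, so ordinary dispatch picks the sparse kernel; @which shows which method a call resolves to:

```julia
using LinearAlgebra, SparseArrays
using InteractiveUtils  # provides @which outside the REPL

S = sparse([0.5 0.0; 0.0 -0.5])
B = rand(2, 3)
C = Matrix{Float64}(undef, 2, 3)

# Dispatch selects a method specialized for sparse A; depending on the
# Julia version, @which may point into SparseArrays directly or into a
# LinearAlgebra wrapper that forwards to the sparse kernel.
println(@which mul!(C, S, B))

mul!(C, S, B)
```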
From the previous example I understood that I first need to preallocate the output array as undefined, but I can’t create an undefined sparse array, so I will go with a dense one instead.
It ends up spending 5 times the memory and 4 times the time:
```julia
using LinearAlgebra, SparseArrays
using Kronecker  # for kronecker()

function foo()
    m = 2
    Block_Sz1 = Array{Float64}(undef, 4, 2)
    Sz_sparse = sparse([[0.5, 0] [0, -0.5]])
    A_sparse = sparse([[0, 1, 0, 0] [0, 0, 3, 1]])
    @time mul!(Block_Sz1, kronecker(sparse(I, m, m), Sz_sparse), A_sparse)
end
foo()
```
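I can’t say for sure where the overhead comes from without profiling, but one thing worth ruling out (an assumption on my part, since I don’t know Kronecker.jl’s internals): a lazy kronecker object may route mul! through a generic fallback. Materializing the factor once with Base’s kron, which for sparse inputs returns a SparseMatrixCSC, and keeping the small right-hand factor dense, guarantees the specialized sparse-times-dense kernel:

```julia
using LinearAlgebra, SparseArrays

m = 2
Sz_sparse = sparse([[0.5, 0] [0, -0.5]])
Block_Sz1 = Array{Float64}(undef, 4, 2)

# Materialize the Kronecker factor once; kron of sparse inputs
# returns a SparseMatrixCSC.
K = kron(sparse(I, m, m), Sz_sparse)

# The destination is dense anyway, so keeping the right factor dense
# lets mul! use the sparse * dense fast path.
A = [0.0 0.0; 1.0 0.0; 0.0 3.0; 0.0 1.0]

mul!(Block_Sz1, K, A)
```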
Even if I manage to solve these issues, there are still uncertainties:
- Reading online, I see I need to preallocate to help the compiler, and mul! obviously requires it anyway. I will experiment with this once I have a program that works within Julia’s semantics.
- In the end I will compute a triple product O' A O, where ' is transposition, A is square and sparse, and O is rectangular and dense. I’ve seen there are packages like Octavian.jl and Tullio.jl that could help optimize performance.
- I still wonder whether it’s possible, after all these steps (related to point 1), to preallocate these sparse matrices before the loop and view slices of them. But I think that will come at the end of all this.
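For the eventual triple product O' A O, one workable pattern (a sketch under assumed sizes, not a benchmarked recipe) is to preallocate a single intermediate for A*O and reuse it on every iteration; the adjoint wrapper O' is lazy, so no transposed copy is ever materialized:

```julia
using LinearAlgebra, SparseArrays

n, k = 100, 10
A = sprand(n, n, 0.05)               # square, sparse
O = rand(n, k)                       # rectangular, dense

# Buffers allocated once, outside any loop, and reused every iteration.
tmp = Matrix{Float64}(undef, n, k)   # holds A * O
out = Matrix{Float64}(undef, k, k)   # holds O' * (A * O)

mul!(tmp, A, O)      # sparse * dense kernel
mul!(out, O', tmp)   # dense' * dense: BLAS handles the adjoint in place
```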