Hi,
I tried the above solution but I have a bug I cannot find. It is about using \
for transpose sparse CuArrays. I perform an iLU decomposition and then try to use the decomposition to solve linear systems on the GPU:
using SparseArrays, LinearAlgebra, IncompleteLU
n = 1000
A = I + sprand(n,n,0.01)
Precilu = ilu(A, τ = 0.1)
Now I check that my formula for the inverse works:
rhs = rand(n)
sol_0 = Precilu \ rhs
sol_1 = Precilu.U' \ ((I+Precilu.L) \ (rhs))
norm(sol_1-sol_0, Inf64)
It returns 0!! Hence, I have perform these solves in the GPU. The first one works well:
using CuArrays
CuArrays.allowscalar(false)
sol_0 = (I+Precilu.L) \ rhs
sol_1 = LowerTriangular(CuArrays.CUSPARSE.CuSparseMatrixCSR(I+Precilu.L)) \ CuArray(rhs)
norm(sol_0-Array(sol_1), Inf64)
and this returns the 4.492221705731936e-9
. However, the following fails and I don’t understand why
sol_0 = (Precilu.U)' \ rhs
sol_1 = LowerTriangular(CuArrays.CUSPARSE.CuSparseMatrixCSR(sparse(Precilu.U'))) \ CuArray(rhs)
norm(sol_0-Array(sol_1), Inf64)
and this returns 1137.9857325312835
. Maybe the fact that
rhs = rand(n)
sol_0 = Precilu \ rhs
sol_1 = LowerTriangular(Precilu.U') \ (LowerTriangular(I+Precilu.L) \ (rhs))
norm(sol_1-sol_0, Inf64)
does not return zero whereas this one does is key
rhs = rand(n)
sol_0 = Precilu \ rhs
sol_1 =(Precilu.U') \ (LowerTriangular(I+Precilu.L) \ (rhs))
norm(sol_1-sol_0, Inf64)