Example of a direct linear solve with a sparse matrix in CUDA



Hoping to write a simple example for my package PseudoArcLengthContinuation.jl, I looked for examples of direct sparse linear solves on the GPU in the context of PDEs.

So far I have not found an example where a direct method for solving A * x = rhs (with A being the discretisation of a differential operator) is faster on the GPU than its CPU equivalent. I used the CUSOLVER part of CuArrays. I noticed that CUDA 10.0 performs much better on this problem. I tried, among other things, the matrix L1 in this example.

Hence, my question is “Does anybody have an example of a fast direct sparse linear solve on the GPU in the context of PDEs?”

I guess I would also be happy with an iterative method simple enough to avoid the use of preconditioning.

Thanks a lot for your help,




If your matrix is SPD (symmetric positive definite), you can use IterativeSolvers.cg, which supports arbitrary array types. You can use CuVector for the rhs and CuArrays.CUSPARSE.CuSparseMatrixCSC for the matrix, or the CSR equivalent.
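
To make the suggestion concrete, here is a minimal sketch of calling cg on a sparse SPD matrix. It runs on the CPU with SparseArrays; the 1D Laplacian, the size n and the iteration cap are assumptions chosen for illustration and are not taken from the thread. On the GPU the idea would be to pass the CuArrays types named above instead of A and rhs; that conversion is left out here because it depends on the installed package versions.

```julia
using LinearAlgebra, SparseArrays
using IterativeSolvers

# Illustrative test problem (an assumption, not the matrix from the question):
# 1D Laplacian with Dirichlet boundary conditions, which is tridiagonal and SPD.
n = 100
A = spdiagm(-1 => fill(-1.0, n - 1),
             0 => fill(2.0, n),
             1 => fill(-1.0, n - 1))
rhs = ones(n)

# Unpreconditioned conjugate gradient; log = true also returns the
# convergence history so that convergence can be checked afterwards.
x, history = cg(A, rhs; maxiter = 2n, log = true)

# Relative residual of the computed solution.
relres = norm(A * x - rhs) / norm(rhs)

println("converged: ", history.isconverged)
println("relative residual: ", relres)
```

Checking history.isconverged and the residual is worth doing on larger grids, since the number of cg iterations grows with the condition number of the discretised operator.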



Thank you! I looked at the test example for cg, and for A = laplace_matrix(Float64, 1000, 2) it does not converge in 1000 steps. Also, the preconditioner does not improve this…