Hi,
In the hope of writing a simple example for my package PseudoArcLengthContinuation.jl, I looked for examples of direct sparse linear solves on the GPU in the context of PDEs.
So far I have not found an example where a direct method for solving `A * x = rhs` (with `A` being the discretisation of a differential operator) outperforms its CPU equivalent in terms of computation time. I used the `CUSOLVER` part of `CuArrays`. I noticed that CUDA 10.0 has much better performance on this problem. I tried, among other things, the matrix `L1` in this example.
Hence, my question is “Does anybody have an example of a fast direct sparse linear solve on the GPU in the context of PDEs?”
I guess I would also be happy with an iterative method simple enough not to need preconditioning.
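
For concreteness, here is a minimal CPU-only sketch of the kind of problem I mean, using only the Julia standard library. It assumes a 2D Dirichlet Laplacian on a square grid as the discretised operator, which is not the matrix `L1` mentioned above. The helpers `laplacian2d` and `cg` are illustrative names, not part of PseudoArcLengthContinuation.jl or CuArrays. The sketch compares the direct solve `A \ rhs` with a plain conjugate gradient loop that uses no preconditioner.

```julia
using LinearAlgebra, SparseArrays

# Assumption: 2D Dirichlet Laplacian on an n x n grid (5-point stencil).
# The resulting matrix is sparse, symmetric and positive definite.
function laplacian2d(n)
    h = 1 / (n + 1)
    T = spdiagm(-1 => fill(-1.0, n - 1), 0 => fill(2.0, n), 1 => fill(-1.0, n - 1))
    Id = sparse(1.0I, n, n)
    return (kron(Id, T) + kron(T, Id)) / h^2
end

# Plain conjugate gradient without preconditioner (hypothetical helper).
# It only needs matrix-vector products, dot products and vector updates.
function cg(A, b; tol = 1e-8, maxiter = 10 * length(b))
    x = zero(b)
    r = copy(b)              # residual b - A*x for the initial guess x = 0
    p = copy(r)              # first search direction
    rs = dot(r, r)
    bnorm = norm(b)
    for k in 1:maxiter
        Ap = A * p
        α = rs / dot(p, Ap)
        x .+= α .* p
        r .-= α .* Ap
        rs_new = dot(r, r)
        # stop once the relative residual is below tol
        sqrt(rs_new) <= tol * bnorm && return x, k
        p .= r .+ (rs_new / rs) .* p
        rs = rs_new
    end
    return x, maxiter
end

n = 64
A = laplacian2d(n)
rhs = ones(n^2)

x_direct = A \ rhs           # sparse direct solve on the CPU
x_cg, iters = cg(A, rhs)     # unpreconditioned iterative solve

println("CG iterations: ", iters)
println("relative difference: ", norm(x_cg - x_direct) / norm(x_direct))
```

Wrapping the two solves in `@time` gives a CPU baseline to compare a GPU version against. I have not checked whether this loop runs unchanged on the sparse GPU types of CuArrays.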
Thanks a lot for your help,
Best