I am new to GPU parallel computing. From what I know of CUSOLVER and CUSPARSE, I am fairly confident they can handle my task: solving large sparse linear systems in parallel.
I have read the CUDA.jl documentation, part of the CuArray.jl source code (a bit difficult for me), and the official CUDA manual for the CUSOLVER library. Here is my code:
    using SparseArrays
    using CuArrays, CuArrays.CUSOLVER

    # Solve A * x = b
    n = 10
    A = sprand(Float32, n, n, 0.5)
    A = sparse(A*A')
    d_A = CuArrays.CUSPARSE.CuSparseMatrixCSR(A)
    b = rand(Float32, n)
    d_b = CuArray(b)
    x = zeros(Float32, n)
    d_x = CuArray(x)
    tol = convert(real(Float32), 1e-4)
    d_x = CUSOLVER.csrlsvqr!(d_A, d_b, d_x, tol, one(Cint), 'O')
    h_x = collect(d_x)
    h_x ≈ Array(A)\b
    # > true

The result returned by the code is true.
But as the value of n increases (e.g. n = 1000), the result is always false. Why do the two calculations give different results?
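To help narrow this down, here is a minimal CPU-only sketch (not part of my original code) that checks whether the mismatch could simply be a Float32 precision effect rather than a solver problem. It assumes the same matrix construction as above. Forming A*A' squares the 2-norm condition number of A, so for larger n the system may be badly conditioned. The names `A64`, `x32`, `x64` and `relres` are introduced only for this illustration, and the seed is arbitrary.

```julia
using SparseArrays, LinearAlgebra, Random

Random.seed!(0)                      # arbitrary seed, only for repeatability
n = 1000
A = sprand(Float32, n, n, 0.5)
A = sparse(A * A')                   # same construction as in the question
b = rand(Float32, n)

A64 = Matrix{Float64}(A)             # dense Float64 copy used as a reference
x32 = Matrix(A) \ b                  # dense Float32 solve on the CPU
x64 = A64 \ Vector{Float64}(b)       # dense Float64 solve on the CPU

# Relative residual norm(A*x - b) / norm(b), evaluated in Float64.
relres(x) = norm(A64 * x - b) / norm(b)

println("cond(A)            = ", cond(A64))
println("residual (Float32) = ", relres(x32))
println("residual (Float64) = ", relres(x64))
println("x32 ≈ x64          = ", x32 ≈ x64)
```

If the two CPU solutions already fail the `≈` test while both residuals are small, that would suggest the comparison is too strict for this matrix in Float32, and comparing residuals (or passing a looser `rtol` to `isapprox`) may be a fairer check of the GPU result.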