I am new to GPU parallel computing. From what I know of **CUSOLVER** and **CUSPARSE**, I am sure that I can complete my task with them: solving large sparse linear systems in parallel.

I have read the documentation of `CUDA.jl`, part of the code of `CuArray.jl` (a bit difficult for me), and the official manual of **CUDA**: `CUSOLVER LIBRARY`. The following is my code:

```
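# Imports assumed from the calls below (sprand/sparse, CuArray, CUSOLVER)
using SparseArrays
using CuArrays
using CuArrays: CUSOLVER

# Solve A * x = b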
n = 10
A = sprand(Float32, n, n, 0.5)
A = sparse(A*A')
d_A = CuArrays.CUSPARSE.CuSparseMatrixCSR(A)
b = rand(Float32, n)
d_b = CuArray(b)
x = zeros(Float32, n)
d_x = CuArray(x)
tol = convert(real(Float32), 1e-4)
d_x = CUSOLVER.csrlsvqr!(d_A, d_b, d_x, tol, one(Cint), 'O')
h_x = collect(d_x)
h_x ≈ Array(A)\b
# true
```

The result returned by the code is `true`.

But as the value of `n` increases (e.g. `n = 1000`), the results are always `false`. I would like to ask: why are the calculation results different?
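One way to narrow this down is to check whether the mismatch already shows up on the CPU, without any GPU call. The sketch below is only a diagnostic under an assumption, not a confirmed explanation: it assumes the difference may come from `Float32` precision, since building the matrix as `A*A'` squares the condition number of the original random matrix. The helper names `relres` and `diagnose` are hypothetical and not part of any library.

```julia
using LinearAlgebra, SparseArrays, Random

Random.seed!(0)  # fixed seed so repeated runs use the same random matrices

# Relative residual ‖A*x - b‖ / ‖b‖ (hypothetical helper name)
relres(A, x, b) = norm(A * x - b) / norm(b)

# Build the same kind of system as in the question and compare a Float32
# CPU solve against a Float64 CPU solve (hypothetical helper name).
function diagnose(n)
    S = sprand(Float32, n, n, 0.5)
    A = Matrix(S * S')                  # dense copy of the matrix from the question
    b = rand(Float32, n)

    x32 = A \ b                         # Float32 reference, as in `Array(A)\b`
    x64 = Float64.(A) \ Float64.(b)     # same system solved in Float64

    return (n = n,
            condA = cond(Float64.(A)),  # 2-norm condition number of A
            relres32 = relres(A, x32, b),
            matches64 = x32 ≈ x64)      # same default tolerance idea as `h_x ≈ Array(A)\b`
end

println(diagnose(100))
```

Calling `diagnose(10)` and `diagnose(1000)` side by side shows how `condA` changes with `n`. If `matches64` is already `false` for large `n`, then two correct solvers can disagree under `≈` in `Float32`, and comparing the relative residual of `h_x` would be a fairer check than comparing the solution vectors directly.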