Hello!

I am trying to invert a large matrix on the GPU, using this code:

```
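using CUDA, LinearAlgebra

function cuinv(m::Matrix{T}) where T
    if size(m, 1) != size(m, 2)
        throw(ArgumentError("Matrix not square."))
    end
    A = CuArray(m)                          # copy the input to the GPU
    B = CuArray(Matrix{T}(I(size(A, 1))))   # identity matrix as the right-hand side
    A, ipiv = CUDA.CUSOLVER.getrf!(A)       # in-place LU factorization with pivoting
    # Solve A * X = I on the GPU, then copy the result (the inverse) back to the host
    Matrix{T}(CUDA.CUSOLVER.getrs!('N', A, ipiv, B))
end

m = rand(100, 100)
isapprox(cuinv(m), inv(m))
# true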
```

Is this an optimal solution?

I based the code above on this example:

```
function Base.:\(_A, _B)
    A, B = copy(_A), copy(_B)
    A, ipiv = CUDA.CUSOLVER.getrf!(A)
    return CUDA.CUSOLVER.getrs!('N', A, ipiv, B)
end
```