Hi all,
I am puzzled by an error I get when using the qr function on a CuArray (CUDA.jl).
The following code works fine:
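using CUDA, LinearAlgebra
a = CUDA.randn(3, 3)  # 3×3 Float32 matrix on the GPU
q = qr(a).Q           # packed Q factor (a QRPackedQ)
q * a                 # works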
However, the following line does not:
a * q
It results in this error:
ERROR: ArgumentError: cannot take the CPU address of a CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
Stacktrace:
[1] unsafe_convert(#unused#::Type{Ptr{Float32}}, x::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
[2] ormqr!(side::Char, trans::Char, A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, tau::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, C::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
[3] rmul!(A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, B::LinearAlgebra.QRPackedQ{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}})
[4] *(A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Q::LinearAlgebra.QRPackedQ{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}})
I would expect a * q and q * a either to both work or to both fail. What causes this difference, and how can I still perform this calculation on the GPU?