CPU address error with qr function

Hi all,

I am puzzled by an error I am getting when using the qr function on CUDA.

The following code works fine:

q * a

However, the following line does not:

a * q

It results in this error:

ERROR: ArgumentError: cannot take the CPU address of a CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
 [1] unsafe_convert(#unused#::Type{Ptr{Float32}}, x::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
 [2] ormqr!(side::Char, trans::Char, A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, tau::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, C::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
 [3] rmul!(A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, B::LinearAlgebra.QRPackedQ{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}})
 [4] *(A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Q::LinearAlgebra.QRPackedQ{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}})

I would expect a * q and q * a to either both work or both raise an error. What causes this, and how can I still run this calculation on the GPU?

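The methods for right multiplication by a GPU-backed QRPackedQ seem to be missing, so a * q falls through to the generic LinearAlgebra rmul! method, which calls the CPU LAPACK ormqr! wrapper (frames [2] and [3] of your stack trace). That wrapper needs a CPU pointer, hence the error. Until the methods are added, you can call the CUSOLVER wrapper directly; note that it overwrites a in place: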

CUDA.CUSOLVER.ormqr!('R', 'N', q.factors, q.τ, a)
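
To see what the 'R', 'N' arguments compute, here is a minimal CPU sketch using the LAPACK wrappers in LinearAlgebra. It assumes the CUSOLVER wrapper follows the same argument convention as its LAPACK counterpart; the matrices m and a are made up for illustration and are not from the original post.

```julia
using LinearAlgebra

# Illustrative data (not from the original post)
m = [4.0 1.0 2.0; 1.0 3.0 0.0; 2.0 0.0 5.0]   # matrix to factorize
a = [1.0 2.0 3.0; 4.0 5.0 6.0]                # 2x3 matrix to multiply by Q from the right

# Packed QR factorization: Householder reflectors end up in `factors`,
# their scalar coefficients in `tau`
factors, tau = LAPACK.geqrf!(copy(m))

# side = 'R' multiplies from the right, trans = 'N' uses Q (not its transpose).
# ormqr! overwrites its last argument, so pass a copy to keep `a` intact.
aq = LAPACK.ormqr!('R', 'N', factors, tau, copy(a))

# Reference result: build Q explicitly and multiply
Q = LAPACK.orgqr!(copy(factors), tau)
println(aq ≈ a * Q)
```

On the GPU, passing copy(a) instead of a in the same way should keep the original array untouched.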

Please file an issue on the CUDA.jl repository.