Hello,
I am also posting here the issue that I raised here.
Describe the bug
The in-place multiplication mul!(A, B, C) fails when A is a vector view of a matrix.
To reproduce
The Minimal Working Example (MWE) for this bug:
using LinearAlgebra
using CUDA
CUDA.allowscalar(false)
# Working case: the destination is a view of a vector
prova1 = CUDA.rand(Int, 100) # Vector
prova2 = CUDA.rand(Int, 200, 200) # Matrix
prova3 = CUDA.rand(Int, 200) # Vector
mul!(@view(prova1[1:10]), transpose(@view(prova2[1:10, 1:10])), @view(prova3[1:10]))
# Failing case: the destination is a vector view (row slice) of a matrix
prova1 = CUDA.rand(Int, 100, 100) # Matrix
prova2 = CUDA.rand(Int, 200, 200) # Matrix
prova3 = CUDA.rand(Int, 200) # Vector
mul!(@view(prova1[1, 1:10]), transpose(@view(prova2[1:10, 1:10])), @view(prova3[1:10]))
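Surprisingly, it works when the vector view of a matrix is the input vector (third argument) rather than the destination A:
# Working case: the input vector is a row view of a matrix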
prova1 = CUDA.rand(Int, 100) # Vector
prova2 = CUDA.rand(Int, 200, 200) # Matrix
prova3 = CUDA.rand(Int, 200, 200) # Matrix
mul!(@view(prova1[1:10]), transpose(@view(prova2[1:10, 1:10])), @view(prova3[1, 1:10]))