Hello all. I hit an unexpected error while writing a CUDA kernel.
Somehow the variable b inside the kernel seems to conflict with the b assigned in main. When I removed b = sum(A), the function worked. Is this intended behavior or a bug?
using CUDA

function main()
    function kernel(A, B)
        i = threadIdx().x
        b = B[i]
        A[i] = b
        return nothing
    end
    A = CUDA.zeros(1)
    B = CUDA.zeros(1)
    CUDA.@cuda kernel(A, B)
    b = sum(A)
end

main()
ERROR: GPU compilation of MethodInstance for (::var"#kernel#11")(::CuDeviceArray{Float32, 3, 1}, ::CuDeviceArray{Float32, 3, 1}) failed
KernelError: passing and using non-bitstype argument
Argument 1 to your kernel function is of type var"#kernel#11", which is not isbits:
.b is of type Core.Box which is not isbits.
.contents is of type Any which is not isbits.
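
The same boxing seems to be reproducible without a GPU, which may help narrow this down. Below is a minimal sketch in plain Julia; the function names outer_shared and outer_local are hypothetical and not part of the code above. As far as I understand Julia's scoping rules, a nested function shares any local variable of the enclosing function that has the same name, so the b assigned inside the inner function and the b assigned later in the outer function are one variable, and the closure stores it in a Core.Box. Marking the inner variable with local (or renaming it, or defining the kernel at top level) appears to avoid the capture.

```julia
# Plain-Julia sketch (no CUDA required). All names here are hypothetical.

function outer_shared()
    function inner(x)
        b = x + 1        # assigns to the `b` of outer_shared, so `b` is captured
        return b
    end
    b = 0                # same variable as the `b` used inside `inner`
    return inner
end

function outer_local()
    function inner(x)
        local b = x + 1  # `local` declares a fresh variable; nothing is captured
        return b
    end
    b = 0                # unrelated to the `b` inside `inner`
    return inner
end

f = outer_shared()
g = outer_local()

# Inspect what each closure carries around.
println("shared: fields = ", fieldnames(typeof(f)),
        ", types = ", fieldtypes(typeof(f)),
        ", isbits = ", isbitstype(typeof(f)))
println("local:  fields = ", fieldnames(typeof(g)),
        ", isbits = ", isbitstype(typeof(g)))
```

If this reading is right, the closure from outer_shared has a b field of type Core.Box and is not isbits, which matches the KernelError above, while the closure from outer_local has no fields and could be passed to a kernel launch.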