Custom backpropagation rule on GPU

If you put @device_code_warntype before the jacobian call, you see:

PTX CompilerJob of kernel #testKernDeff(CuDeviceArray{Float32, 3, 1}, CuDeviceArray{Float32, 3, 1}, CuDeviceArray{Float32, 3, 1}, CuDeviceArray{Float32, 3, 1}, CuDeviceArray{Float32, 3, 1}, Array{Float32, 3}) for sm_75, always_inline=false

Compare that to the direct invocation of testKernDeff: the last argument here is a host Array{Float32, 3} instead of a CuDeviceArray, because collect always returns a CPU array.
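To illustrate the point about collect, here is a minimal sketch, assuming CUDA.jl is installed and a working GPU is available. The variable names a, b and c are hypothetical and not taken from the original code; the sketch only shows the array types involved, not the kernel or the jacobian call.

```julia
using CUDA

# A 3-D array living in GPU memory (hypothetical stand-in for a kernel argument).
a = CUDA.zeros(Float32, 4, 4, 4)

# collect copies the data back to host memory and returns a plain Array.
b = collect(a)

@assert a isa CuArray{Float32, 3}
@assert b isa Array{Float32, 3}

# Uploading the host array again yields a CuArray, which CUDA.jl can
# convert to a CuDeviceArray when it is passed to a kernel.
c = CuArray(b)
@assert c isa CuArray{Float32, 3}
```

So one possible fix, under these assumptions, is to avoid collect on anything that ends up as a kernel argument, or to wrap the result in CuArray before the launch.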
