Thank you, I have read the docs for GC.@preserve and this interesting discussion: On the garbage collection. I hope I now understand what causes the error in my case. However, it is still not clear to me how I should use GC.@preserve in the case of an array of arrays. Here is a much simpler MWE:
using CUDA

function kernel(a, bb)
    id = threadIdx().x
    a[id] = sum(bb[id])
    return nothing
end

N = 10
a = CUDA.zeros(N)
b = Array{CuArray}(undef, N)
for i=1:N
    b[i] = CUDA.ones(2)
end

# This potentially can cause an error
bb = CuArray([cudaconvert(b[i]) for i=1:N])
@cuda threads=N kernel(a, bb)

# Option 1:
bb = CuArray([cudaconvert(b[i]) for i=1:N])
GC.@preserve b begin
    @cuda threads=N kernel(a, bb)
end

# Option 2:
btmp = [cudaconvert(b[i]) for i=1:N]
bb = CuArray(btmp)
GC.@preserve btmp begin
    @cuda threads=N kernel(a, bb)
end
In this example I do not understand what exactly I should preserve: the original array of CuArrays b, the temporary array of CuDeviceArrays [cudaconvert(b[i]) for i=1:N], or both of them.
Sorry, it is very hard to debug such code, since the GC manages to collect the original objects only if the kernel has been running long enough. Can I mimic the GC behaviour and trigger the error myself, e.g. with something like finalize(b)?
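To make the last question more concrete, here is a CPU-only sketch of how I understand Base's finalizer and finalize to behave. FakeBuffer, release! and make_buffer are hypothetical stand-ins I made up for illustration; they are not CUDA.jl types or functions, and I am only assuming that CuArray cleanup works in a broadly similar way. If this is right, calling finalize on the outer array alone would probably not touch its elements, so I would have to finalize each element instead.

```julia
# Hypothetical stand-in for a GPU buffer; not part of CUDA.jl.
mutable struct FakeBuffer
    data::Vector{Float32}
    freed::Bool
end

# Hypothetical cleanup routine: only marks the buffer as released.
release!(buf::FakeBuffer) = (buf.freed = true; nothing)

function make_buffer(n)
    buf = FakeBuffer(ones(Float32, n), false)
    # Runs when buf is garbage collected or when finalize(buf) is called.
    finalizer(release!, buf)
    return buf
end

b = [make_buffer(2) for _ in 1:10]

# The outer Vector has no finalizer of its own, so this leaves the elements alone.
finalize(b)
all_alive = !any(x -> x.freed, b)

# Finalizing each element runs its registered finalizer immediately.
foreach(finalize, b)
all_freed = all(x -> x.freed, b)
```

In the real MWE the analogous call would presumably be something like foreach(finalize, b) right after the kernel launch, but I do not know whether that reliably reproduces the error on the GPU.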