CuArray local scope memory issue

As @jpsamaroo said, you should put the code inside a function. You can also preallocate the output array and do the operation in place with .= broadcasting. This function should work for both GPU arrays and normal arrays:

function f!(c, a, b)
    c .= (a.==true) .&& (b.==true)
end

This can be benchmarked:

using CUDA
using BenchmarkTools
n=1024
a=CUDA.rand(Bool, n)
b=CUDA.rand(Bool, n)
c=similar(a)
@btime CUDA.@sync f!($c, $a, $b)

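Because c is preallocated and the fused broadcast writes into it in place, each call shouldn't need to allocate a new array, so this shouldn't have memory problems or require the GC.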

P.S. If a and b are already Bool arrays, could you simplify to the following?

function f!(c, a, b)
    c .= a .&& b
end
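
To check that the simplification is safe, here is a small CPU-only sketch comparing the two versions. The names f_explicit! and f_short! are hypothetical, used only so both definitions can coexist, and the sketch assumes Julia 1.7 or later (where .&& is available) and Bool inputs. As far as I can tell, the shorter form only works when the elements really are Bool, whereas the == true form would also accept something like integer 0/1 arrays.

```julia
# Hypothetical names so both versions can be defined side by side.
f_explicit!(c, a, b) = (c .= (a .== true) .&& (b .== true))
f_short!(c, a, b) = (c .= a .&& b)

# Plain CPU arrays; assumes Bool element type.
a = rand(Bool, 1024)
b = rand(Bool, 1024)
c1 = similar(a)
c2 = similar(a)

f_explicit!(c1, a, b)
f_short!(c2, a, b)

@assert c1 == c2         # both versions agree for Bool inputs
@assert c1 == (a .& b)   # and match the elementwise AND
```

Swapping rand for CUDA.rand should exercise the same comparison on the GPU, though I have only sketched the CPU case here.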