Avoiding Memory leaks using CuArrays

Hello,

I have read that there is no garbage collection on GPUs. I have an algorithm that uses an Array{Array{CuArray{Float32,1},1},1} as a buffer. From it I sample a batch that I convert into an NTuple{N,CuArray} to feed a Flux training loop. The buffer has a fixed size, and I regularly generate new elements (each an Array{CuArray{Float32,1},1}) to replace the oldest ones. I think that when I do that, I replace pointers to CuArrays with other ones without freeing the VRAM. I know that one solution is to bring the CuArray back to the CPU and then erase it, but I think that is highly inefficient and very slow. My first question is: is there a way to free that memory without the costly transfer to the CPU?

Here’s an example:

using CuArrays, Flux

# Each buffer element is a vector of N CuArrays (here N = 2).
buffer = Vector{Vector{CuArray{Float32,1}}}()
# ... populate the buffer to its fixed size.

x = Vector{CuArray{Float32,1}}()
y = Vector{CuArray{Float32,1}}()

batch = rand(buffer, 20)
for element in batch
    push!(x, element[1])
    push!(y, element[2])
end
x = Flux.batch(x) # produces a CuArray{Float32,2}
y = Flux.batch(y) # produces a CuArray{Float32,2}
data = (x, y)
# ... train a network on data
newElement = generateanewelement() # returns an Array{CuArray{Float32,1},1}
push!(buffer, newElement)
popfirst!(buffer) # the popped element is an array of pointers, the VRAM is not freed

I thought that I could simply overwrite the old element with the new one at the same location (which would be the most efficient way to go). Say I do it this way:

function overwriteoldelement(buffer, indexofoldest)
    buffer[indexofoldest][1] = generatenewX()
    buffer[indexofoldest][2] = generatenewY() #these two output CuArrays
end

I don’t think this overwrites the memory; it simply changes the pointer or something like that, right?
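
To make the distinction in the question concrete, here is a minimal CPU-only sketch using plain Arrays. It assumes CuArray follows the same rules as any Julia array here: indexed assignment rebinds the slot to a different object, while copyto! (or broadcasting with .=) writes into the existing one. The buffer contents and variable names are illustrative, not taken from the code above.

```julia
# CPU-only illustration with plain Arrays (stand-ins for CuArrays).
buffer = [[zeros(Float32, 4), zeros(Float32, 4)]]
old = buffer[1][1]                 # keep a second reference to the stored array

# Rebinding: the slot now refers to a different array; `old` is untouched.
buffer[1][1] = ones(Float32, 4)
rebound = buffer[1][1] !== old     # true: the slot holds a new object

# In-place overwrite: the contents of the existing array are replaced.
target = buffer[1][2]
copyto!(target, fill(2f0, 4))      # equivalently: target .= fill(2f0, 4)
inplace = buffer[1][2] === target  # true: still the same object
```

Note that an in-place copy only reuses the destination's memory; if the new data is first produced as a separate array, that temporary still has to be collected or freed.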

Do you observe an OOM situation? CuArray is garbage collected, so if there’s no way to access a CuArray object on the host, it’ll get collected eventually. Objects are not kept alive by having been sent to the GPU, so you don’t need to transfer back. If you want to eagerly free memory, call CuArrays.unsafe_free!(::CuArray).


Hello,

Yes, I do. Try this simple code:

using CuArrays, Flux
buffer = []
while true
    push!(buffer, gpu(rand(500000)))
    if length(buffer) > 500
        popfirst!(buffer)
    end
end

This quickly triggers an out-of-memory exception. It’s clearly a memory leak, since even after the Julia process encounters the exception, my VRAM is not freed according to the task manager.

It does not happen if I call unsafe_free! on the oldest element before popping it, though.
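
A rough sketch of that pattern, written so it runs without a GPU: push_bounded! and its release keyword are hypothetical names for illustration, and the CPU callback below only records which entries were evicted. With CuArrays, the assumption is that one would pass release = CuArrays.unsafe_free! instead.

```julia
# Hypothetical helper: append `item`, and once the buffer exceeds `maxlen`,
# remove the oldest entry and hand it to `release` before dropping it.
function push_bounded!(buffer, item, maxlen; release = identity)
    push!(buffer, item)
    if length(buffer) > maxlen
        oldest = popfirst!(buffer)
        release(oldest)
    end
    return buffer
end

# CPU stand-in: record which entries were released instead of freeing VRAM.
freed = Int[]
buf = Vector{Vector{Float32}}()
for i in 1:5
    push_bounded!(buf, fill(Float32(i), 2), 3; release = a -> push!(freed, Int(a[1])))
end
```

An array that has been eagerly freed must not be used afterwards, so this only fits when nothing else still holds a reference to the evicted element.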


Hi @maleadt,

During my computations I get this error message from time to time:

error in running finalizer: AssertionError(msg="Release of dead CUDAdrv.Mem.Buffer(CUDAdrv.CuPtr{Nothing}(0x000000090367fc00), 32, CUDAdrv.CuContext(Ptr{Nothing} @0x000000002106eda0, false, true))")

It does not throw an exception, and my computation goes on. It does not happen consistently either. I think this is related to the unsafe_free! function that I use. Do you know what this means?