Why is it consuming and not freeing GPU memory?

Hi,
I’m running into something I don’t understand: my simple CUDA code keeps
consuming memory and never frees it. A simple example:

julia> using CUDA

julia> CUDA.memory_status()
Effective GPU memory usage: 4.16% (501.188 MiB/11.759 GiB)
Memory pool usage: 0 bytes (0 bytes reserved)

julia> kk = CUDA.rand(256,256,256)
julia> aux = CUDA.rand(256,256,256)

julia> CUDA.memory_status()
Effective GPU memory usage: 19.58% (2.302 GiB/11.759 GiB)
Memory pool usage: 128.000 MiB (128.000 MiB reserved)

julia> for i in 1:10
    aux .= CUDA.exp.(kk)
    CUDA.memory_status()
end
Effective GPU memory usage: 19.64% (2.310 GiB/11.759 GiB)
Memory pool usage: 128.000 MiB (128.000 MiB reserved)
Effective GPU memory usage: 19.64% (2.310 GiB/11.759 GiB)
Memory pool usage: 128.000 MiB (128.000 MiB reserved)
Effective GPU memory usage: 19.64% (2.310 GiB/11.759 GiB)
Memory pool usage: 128.000 MiB (128.000 MiB reserved)
Effective GPU memory usage: 19.64% (2.310 GiB/11.759 GiB)
Memory pool usage: 128.000 MiB (128.000 MiB reserved)
Effective GPU memory usage: 19.64% (2.310 GiB/11.759 GiB)
Memory pool usage: 128.000 MiB (128.000 MiB reserved)
Effective GPU memory usage: 19.64% (2.310 GiB/11.759 GiB)
Memory pool usage: 128.000 MiB (128.000 MiB reserved)
Effective GPU memory usage: 19.64% (2.310 GiB/11.759 GiB)
Memory pool usage: 128.000 MiB (128.000 MiB reserved)
Effective GPU memory usage: 19.64% (2.310 GiB/11.759 GiB)
Memory pool usage: 128.000 MiB (128.000 MiB reserved)
Effective GPU memory usage: 19.64% (2.310 GiB/11.759 GiB)
Memory pool usage: 128.000 MiB (128.000 MiB reserved)
Effective GPU memory usage: 19.64% (2.310 GiB/11.759 GiB)
Memory pool usage: 128.000 MiB (128.000 MiB reserved)

OK, that’s what I would expect: no new memory is being allocated on the GPU.
But if I run

julia> for i in 1:10
    aux .= CUDA.exp.(0.01f0*kk)
    CUDA.memory_status()
end
Effective GPU memory usage: 20.26% (2.382 GiB/11.759 GiB)
Memory pool usage: 192.000 MiB (192.000 MiB reserved)
Effective GPU memory usage: 20.79% (2.445 GiB/11.759 GiB)
Memory pool usage: 256.000 MiB (256.000 MiB reserved)
Effective GPU memory usage: 21.32% (2.507 GiB/11.759 GiB)
Memory pool usage: 320.000 MiB (320.000 MiB reserved)
Effective GPU memory usage: 21.85% (2.570 GiB/11.759 GiB)
Memory pool usage: 384.000 MiB (384.000 MiB reserved)
Effective GPU memory usage: 22.39% (2.632 GiB/11.759 GiB)
Memory pool usage: 448.000 MiB (448.000 MiB reserved)
Effective GPU memory usage: 22.92% (2.695 GiB/11.759 GiB)
Memory pool usage: 512.000 MiB (512.000 MiB reserved)
Effective GPU memory usage: 23.45% (2.757 GiB/11.759 GiB)
Memory pool usage: 576.000 MiB (576.000 MiB reserved)
Effective GPU memory usage: 23.98% (2.820 GiB/11.759 GiB)
Memory pool usage: 640.000 MiB (640.000 MiB reserved)
Effective GPU memory usage: 24.51% (2.882 GiB/11.759 GiB)
Memory pool usage: 704.000 MiB (704.000 MiB reserved)
Effective GPU memory usage: 25.04% (2.945 GiB/11.759 GiB)
Memory pool usage: 768.000 MiB (768.000 MiB reserved)

it keeps consuming memory and not freeing it, and nothing seems
to be releasing it:

julia> CUDA.memory_status()
Effective GPU memory usage: 25.52% (3.000 GiB/11.759 GiB)
Memory pool usage: 768.000 MiB (768.000 MiB reserved)

julia> CUDA.reclaim()

julia> CUDA.memory_status()
Effective GPU memory usage: 25.45% (2.993 GiB/11.759 GiB)
Memory pool usage: 768.000 MiB (768.000 MiB reserved)

This is a minimal working example, but in my iterative codes this sometimes
makes my Linux box hang…

What am I doing wrong?

Thanks in advance…

Did you mean to do this?

julia> for i in 1:10
    aux .= CUDA.exp.(0.01f0.*kk)
    CUDA.memory_status()
end

Notice .* instead of * for broadcasting.
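
The difference matters for allocations. With the non-dotted *, the product 0.01f0*kk is evaluated first and materializes a fresh temporary CuArray (64 MiB for a 256×256×256 Float32 array) on every iteration; those temporaries pile up in the pool as uncollected garbage, which is exactly the 64 MiB-per-iteration growth shown above. With .*, the scalar multiplication fuses with the exp broadcast into a single kernel that writes straight into aux, so nothing extra is allocated. A minimal sketch of the two variants:

julia> aux .= CUDA.exp.(0.01f0 * kk)   # materializes a 64 MiB temporary each time

julia> aux .= CUDA.exp.(0.01f0 .* kk)  # fully fused: one kernel, no temporary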

Judging by the docs, you can try setting the JULIA_CUDA_SOFT_MEMORY_LIMIT or JULIA_CUDA_HARD_MEMORY_LIMIT environment variable (not sure, but I think it needs to be set before loading CUDA). Also consider calling CUDA.reclaim() or GC.gc(), or manually freeing arrays with CUDA.unsafe_free!(a).
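
For example (a minimal sketch; the 2GiB value is just an illustration, and my understanding is that the variable is read when CUDA.jl initializes, so it must be set before the package is loaded):

julia> ENV["JULIA_CUDA_SOFT_MEMORY_LIMIT"] = "2GiB"  # before `using CUDA`

julia> using CUDA

or equivalently from the shell:

$ JULIA_CUDA_SOFT_MEMORY_LIMIT=2GiB julia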

You should generally not use the same GPU for driving a display and doing computationally-intensive things. It’s actually the compute that will make the output “hang”, not the use of memory.

And regarding the use of memory, Julia is a garbage collected language, so memory will only be collected when it’s necessary.
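
That is also why CUDA.reclaim() alone did nothing in the example above: the ten temporaries were already unreachable, but the GC had not run yet, so the pool still counted them as in use. Collecting first and then reclaiming should actually shrink the pool (a sketch only; see the caveat below):

julia> GC.gc()            # run finalizers on the unreachable temporaries

julia> CUDA.reclaim()     # hand the now-free pool memory back to the driver

julia> CUDA.memory_status()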

Please don’t call CUDA.reclaim() or even GC.gc() unless really necessary; both operations will significantly slow down your application if misused. unsafe_free!ing unused memory can be good practice, but as the name implies it is an unsafe operation, so it should be done with care.
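
If you really do need the temporary, a pattern like the following sketch frees it deterministically as soon as it is dead, without waiting for the GC:

julia> for i in 1:10
           tmp = 0.01f0 * kk       # the 64 MiB temporary
           aux .= CUDA.exp.(tmp)
           CUDA.unsafe_free!(tmp)  # back to the pool immediately; tmp must not be used after this
       end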

You can also try the changes from “Consider running GC when allocating and synchronizing” by maleadt (Pull Request #2304 · JuliaGPU/CUDA.jl · GitHub), which will consider collecting memory at more points than only when running out of it.

Hi,
it seems that setting a soft limit on the amount of GPU memory to use via
JULIA_CUDA_SOFT_MEMORY_LIMIT
solves the problem, at least in the very first tests I’m running now. Thanks!
Best,
Ferran.