Question about garbage collection, AD, and CUDA

I ran into the issue described in this thread: Allocator very slow to reclaim memory after running for sufficiently long · Issue #137 · JuliaGPU/CUDA.jl · GitHub. For a machine learning workload, I noticed that if I don't manually disable and re-enable the GC in Julia while using the GPU (i.e. disabling the GC before the GPU section and re-enabling it once the GPU is done working), the code slows down significantly.
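
Roughly, the pattern looks like this (a minimal sketch; `model`, `data`, and `train_step!` are placeholders for my actual workload):

```julia
using CUDA

GC.enable(false)          # pause Julia's GC before the GPU-heavy section
try
    for batch in data
        train_step!(model, batch)   # placeholder for the real GPU work
    end
finally
    GC.enable(true)       # re-enable the GC once the GPU is done
    GC.gc()               # collect everything allocated in the meantime
end
```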

  1. Is this safe with regard to memory leaks and segfaults? The documentation (Memory management · CUDA.jl) says that manual memory management is not needed, so I was hoping I wouldn't have to code at a lower level.

  2. Which libraries allow reverse-mode automatic differentiation of a cost function that involves a derivative obtained with another AD library or with finite differencing? I looked through the threads and see that nested AD may not be supported right now. Concretely, I'm after something like the sketch below.
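
This is the kind of computation I mean (a sketch using Zygote over ForwardDiff as an example pairing; whether this works in general, e.g. for closures capturing parameters, is exactly what I'm unsure about):

```julia
using Zygote, ForwardDiff

f(x) = sin(x)

# the cost involves an inner derivative, here via forward-mode AD
cost(x) = ForwardDiff.derivative(f, x)^2      # = cos(x)^2

# outer reverse-mode gradient of the derivative-containing cost
g, = Zygote.gradient(cost, 1.0)               # ≈ -2cos(1.0)sin(1.0)
```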

It should be. You can of course run out of memory while the GC is disabled, but when you re-enable it, it will also collect memory that was allocated during that time.

Regarding manual memory management: you can always help the GC along by adding calls to CUDA.unsafe_free! where possible. This also helps if you're already disabling the GC, because memory can be reused more quickly.
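
For example (a minimal sketch, assuming the array is genuinely no longer needed):

```julia
using CUDA

x = CUDA.rand(1024, 1024)      # device array
y = sum(x .* x)                # last use of x
CUDA.unsafe_free!(x)           # return x's memory to the pool right away;
                               # x must not be used after this call
```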
