GPU memory is managed through the Julia GC, although indirectly: when a CuArray instance goes out of scope and is collected, the refcount of its GPU buffer is decremented, and the memory is freed once that count drops to 0. So make sure your arrays are out of scope before calling
gc() (GC.gc() on 0.7), and make sure no other objects share the memory (e.g. through a view). You can enable debug messages that print during finalization using
JULIA_DEBUG=CUDAdrv on 0.7, and
--compile-cache=no on 0.6.
Alternatively, you can force early collection by calling
finalize on an array. IIRC this is a pretty slow call though, and we should probably add a different early-freeing mechanism. It also won’t free the memory if other objects still hold a reference to the buffer, i.e. if the refcount doesn’t drop to 0.
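To illustrate the refcounting behaviour described above, here is a minimal pure-Julia sketch that runs without a GPU. MockBuffer, MockArray and release! are hypothetical stand-ins made up for this example; they are not CUDAdrv's actual types or internals. The sketch assumes Julia 0.7 or later, where the argument order is finalizer(f, x).

```julia
# Hypothetical stand-in for a refcounted GPU buffer; NOT CUDAdrv's real types.
mutable struct MockBuffer
    refcount::Int
    freed::Bool
end

# Called from the finalizer: drop one reference, "free" when none are left.
function release!(arr)
    buf = arr.buf
    buf.refcount -= 1
    if buf.refcount == 0
        buf.freed = true   # a real implementation would free device memory here
    end
    return nothing
end

# Hypothetical array wrapper that owns one reference to a buffer.
mutable struct MockArray
    buf::MockBuffer
    function MockArray(buf::MockBuffer)
        buf.refcount += 1
        arr = new(buf)
        # Runs when the GC collects `arr`, or immediately on finalize(arr).
        finalizer(release!, arr)
        return arr
    end
end

buf   = MockBuffer(0, false)
a     = MockArray(buf)
alias = MockArray(buf)   # second object sharing the same memory, like a view

finalize(a)              # early finalization of the first owner
@assert buf.refcount == 1 && !buf.freed   # still shared, so nothing is freed

finalize(alias)          # last owner gone
@assert buf.refcount == 0 && buf.freed    # now the buffer is released
```

The point of the sketch is only that finalizing one array does not release memory while another object still shares the buffer.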
EDIT: of course, I have assumed you’re talking about CUDAdrv’s CuArray. If you’re talking about CuArrays.jl, there’s an additional level of memory pooling. It should try to free memory by calling
gc() once it encounters an out-of-memory error during allocation, and other than that the same rules as above (objects should be out of scope, refcounting) apply.