I was wondering whether it’s good practice to always copy GPU arrays back to the CPU before a program ends.
I’m running a large program that performs simulations with a large amount of data on the GPU. I know that some results need to be presented on the CPU, so those must be copied back. But what about the rest of the data? Would it be good to copy everything back to the CPU as well? I have a lot of data, so this could introduce noticeable overhead. On the other hand, I’m concerned that not copying it back might cause other issues. Does anyone with experience in this area have any advice?
Thanks for your reply! I’m not sure - there might be some issues with running out of GPU memory before the program ends? But that seems like a very rare case.
Copying data from the GPU to the CPU is not typically how you free GPU memory manually. I don’t know which GPU library you’re using, but memory is likely managed automatically, as in base Julia.
To add to @Benny’s reply: copying data from the GPU to the CPU (using e.g. copyto!(cpu_array, gpu_array) or Array(gpu_array)) does not in fact free any memory on the GPU, since it just, well…, copies. Afterwards, the data will still be on the GPU (and now also on the CPU).
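To illustrate, here is a minimal sketch assuming CUDA.jl (other GPU array libraries work analogously). Both copy patterns leave the device allocation untouched:

```julia
using CUDA

gpu_array = CUDA.rand(Float32, 1_000_000)    # data lives on the GPU

cpu_array = Array(gpu_array)                 # allocates a new CPU array and copies into it
# gpu_array still holds its device memory; nothing was freed

# Alternatively, copy into a preallocated CPU buffer:
cpu_buffer = Vector{Float32}(undef, length(gpu_array))
copyto!(cpu_buffer, gpu_array)               # again, only copies; the GPU allocation remains
```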
In methods where I allocate a lot inside a loop, I do find it useful to free memory manually with CUDA.unsafe_free!(gpu_array). In general, take a look at the Memory management page in the CUDA.jl documentation. Presumably the other GPU libraries offer similar functionality.
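A sketch of that pattern, again assuming CUDA.jl (the loop body and `n_steps` are hypothetical placeholders):

```julia
using CUDA

n_steps = 100  # hypothetical number of simulation steps

for step in 1:n_steps
    tmp = CUDA.zeros(Float32, 10^7)   # large temporary allocation each iteration
    # ... use tmp for this step's computation ...
    CUDA.unsafe_free!(tmp)            # return the memory to CUDA.jl's pool immediately,
                                      # instead of waiting for the GC to collect tmp
end
```

As the name suggests, this is unsafe: after the call, `tmp` must not be used again, since its memory may be reused by later allocations.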