Memory is not freed with CUDA and two REPLs

The memory is being cached by the CUDA stream-ordered allocator for future reuse, so it is not handed back to the device as soon as your arrays are freed. This doesn't work well when multiple Julia instances share the same GPU, because memory cached by one process is not available to the other. If you really need the memory, you can run with JULIA_CUDA_MEMORY_POOL=none, but that is going to hurt performance.
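As a minimal sketch of the workaround, assuming a POSIX-style shell and that each REPL is started from it, you can export the variable before launching Julia so that both sessions pick it up. The launch line is left as a comment for illustration only:

```shell
# Disable CUDA.jl's caching memory pool for Julia sessions started from this shell.
# Expect slower allocations, since freed memory is no longer reused from a cache.
export JULIA_CUDA_MEMORY_POOL=none

# Then start each REPL as usual from this same shell, e.g.:
#   julia
```

The variable has to be set in the environment of every Julia process that shares the GPU. Setting it in only one REPL leaves the other one caching as before.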
