Using GPU via PyCall causes non-reusable memory allocation

It sounds like Julia’s garbage collection just isn’t running frequently enough for you, probably because Julia doesn’t know that memory is running low on the CUDA side?

You can explicitly tell Python you are done with an object o from PyCall by calling pydecref(o). (This is safe if you are done with the object: it gets mutated to a NULL object to prevent it from being decref’ed again. Perhaps the function should have been called pydecref!…) Equivalently, you can just call finalize(o).
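For example, here is a minimal sketch of releasing a Python-owned object eagerly rather than waiting for Julia’s GC. It assumes PyCall.jl and NumPy are installed; the NumPy allocation is just an illustrative stand-in for whatever large object your library creates:

```julia
using PyCall

np = pyimport("numpy")

# Use pycall with an explicit PyObject return type so the result stays a
# Python-owned object instead of being auto-converted to a Julia Array:
o = pycall(np.zeros, PyObject, 10^6)

# ... work with `o` ...

# When you are done, release the Python reference immediately instead of
# waiting for a GC pass to run the finalizer. `o` is mutated to a NULL
# PyObject, so it must not be used afterward:
pydecref(o)

# Equivalently, you could call finalize(o).
```

The same pattern applies to any `PyObject` holding GPU-side memory: dropping the reference count promptly lets the Python/CUDA side reclaim the allocation without waiting for Julia’s collector.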

See also JuliaLang/julia issues #11207 (“stop using finalizers for resource management?”) and #7721 (“`with` for deterministic destruction”) for discussion of this general issue for resource management in Julia.