I have an “embarrassingly parallel” task that uses the GPU. Each instance of the task would run fine on a GPU with ~3GB of memory, but I have a 3090 Ti with 24GB of memory. I want to parallelise this task across the GPU simply by using the “pmap” function from the Distributed package. The issue is that with more than one process, the GPU memory fills up very quickly and the run then errors out. Is there any way to restrict the amount of GPU memory that each process can use, so that the GC gets triggered more often?
Multiple processes are going to compete for GPU memory, and currently NVIDIA does not offer a feature to limit per-process memory use, so there isn’t much we can do (apart from completely taking memory management into our own hands, which I’m not planning to do). Currently, processes are only meant to be used with GPU programming when you can allocate a single device per process.
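For reference, a minimal sketch of the one-device-per-process setup, assuming a machine with several GPUs visible to CUDA.jl (so it does not help on a single 3090 Ti):

```julia
using Distributed, CUDA

# Start one worker process per visible GPU (assumes more than one device).
addprocs(length(devices()))
@everywhere using CUDA

# Bind each worker to its own device so processes never share GPU memory.
asyncmap(zip(workers(), devices())) do (p, d)
    remotecall_wait(p) do
        device!(d)
    end
end
```

After this, `pmap` over the workers runs each task on the device its worker was bound to.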
As an alternative for your situation, can’t you use threads? Those share the memory allocated on the GPU.
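A rough sketch of what the threaded version could look like, assuming Julia was started with several threads (e.g. `julia --threads=8`). Here `process_item` and `inputs` are hypothetical stand-ins for your real task and data:

```julia
using CUDA

# Hypothetical stand-in for the real per-item GPU work.
function process_item(x::Vector{Float32})
    d = CuArray(x)            # upload the input to the GPU
    r = sum(abs2, d)          # do some work on the device
    CUDA.unsafe_free!(d)      # release the buffer eagerly instead of waiting for the GC
    return r
end

# Hypothetical inputs; one task is spawned per item.
inputs = [rand(Float32, 1_000) for _ in 1:16]
tasks = [Threads.@spawn process_item(x) for x in inputs]
results = fetch.(tasks)
```

All tasks live in one process, so they allocate from the same GPU memory instead of competing across processes.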