CUDAnative: using multiple GPUs

gpu
cudanative
parallel

#1

Hi,

I am using Julia on Docker (the maleadt/juliagpu image) and I’d like to use multiple GPUs instead of one.
The docs mention this:

dev = CuDevice(0)
CuContext(dev) do ctx
    # allocate things in this context
    @cuda ...
end

but it does not seem to work. I run this block twice, once per device, but no matter which device number I choose, it always uses a single GPU. Any help is greatly appreciated.


#2

Not sure, but you might need an @async: that do-block syntax may block until the first device finishes its task.


#3

My code looks like this:

function calculateStuff(gpuId)
  dev = CuDevice(gpuId)
  CuContext(dev) do ctx
    @cuda (threads, blocks) expensiveFunction(...)
    synchronize()
  end
end

@spawn calculateStuff(0)
@spawn calculateStuff(1)

I launch Julia with two worker processes, so I figured that should do the trick? Nonetheless, only one GPU is used.


#4

I don’t have a system with multiple GPUs, so I haven’t really worked on a decent multi-GPU API.
What I assume is happening here is that you aren’t actually executing this code in separate processes. CUDA is an API with global state, and the CuContext(dev) call sets the global context for all subsequent API calls from that process.

Maybe try the following (again, untested, but I think it should work):

using Distributed

addprocs(2)  # or start Julia with `julia -p 2`

@everywhere using CUDAdrv, CUDAnative

@everywhere function expensiveFunction()
    # ...
end

@everywhere function calculateStuff(gpuId)
    dev = CuDevice(gpuId)
    CuContext(dev) do ctx
        return expensiveFunction()
    end
end

# pin each call to a distinct worker so each process gets its own context
s1 = @spawnat 2 calculateStuff(0)
s2 = @spawnat 3 calculateStuff(1)

fetch(s1)
fetch(s2)

I’m not too familiar with Distributed, so @everyone feel free to correct my use of the library.


#5

I resolved the issue. The problem was that all workers executed the same code: when you let workers preload files, they execute everything that’s not inside a function definition, for example.
I put everything GPU-related in a module in a separate file, which is loaded by each worker (-L module.jl). The “main” file then calls functions from the module using @spawn, and everything works as expected. Thanks for the help!
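In case it helps others, a minimal sketch of what that setup might look like. The file and module names (module.jl, GPUWork) and the placeholder kernel are illustrative, not the actual code from the original post:

```julia
# --- module.jl: loaded on every process via `julia -p 2 -L module.jl main.jl` ---
module GPUWork

using CUDAdrv, CUDAnative

# placeholder for the real expensive kernel
function expensiveKernel()
    return nothing
end

function calculateStuff(gpuId)
    dev = CuDevice(gpuId)
    CuContext(dev) do ctx
        @cuda (1, 1) expensiveKernel()
        synchronize()
    end
end

end # module

# --- main.jl: orchestration only, no top-level GPU code ---
using Distributed

# pin each call to a distinct worker so each process drives one GPU
r1 = @spawnat 2 GPUWork.calculateStuff(0)
r2 = @spawnat 3 GPUWork.calculateStuff(1)

fetch(r1)
fetch(r2)
```

Since every process loads module.jl, the only top-level code that runs everywhere is the module definition itself; the actual GPU work only happens where it is explicitly spawned.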


#6

Hi,
That sounds a lot like a problem I’m having. Would you mind posting a gist with a small example of your setup? Thanks a million!