Checking that work is being sent to processors: GPU vs Multiple CPUs

That’s really odd. The REPL isn’t showing you anything? If you run other GPU codes, is it fine?

Running watch -n 1 nvidia-smi might be useful so the output keeps refreshing.

The first time CUDA.jl is used, it downloads and installs the CUDA toolkit. You might want to trigger this manually first to be sure it works:

using CUDA
CUDA.versioninfo()

This should download everything you need.

This helped a little: I see that the GPU’s fans are turning on as a result of some usage. But I see nothing in the processes section. Should I expect to see something here?

Is there some example Julia code floating around where I would expect to see something populating the processes section of the nvidia-smi tool? Sorry, I’m new to all this. I’m just relying on script kiddie luck and MATLAB experience to get me through right now.

Did you run the first CUDA tutorial?

https://cuda.juliagpu.org/dev/tutorials/introduction/#Your-first-GPU-computation

You should be able to see utilization from just a simple matmul.

using CUDA
A = cu(rand(1000,1000))
B = cu(rand(1000,1000))
A*B

The only indication that anything is occurring is that the GPU fan turns on, the temp goes up, and the power usage increases. I don’t see anything in the processes section though. I’m running watch -n .2 nvidia-smi.

Ya, I went through this tutorial and watched some of the YouTube videos online.

Does the performance of the matmul increase? Like by orders of magnitude.

yes

Maybe this is something about how the integrator is programmed. Are processes running so fast that they’re not being picked up in nvidia-smi?

Well, it could be; nvidia-smi probably has a limited sampling rate too. In any case, the performance increase tells you that it’s running on the GPU. Otherwise, if you only want to run “something” on the GPU and see it pop up in nvidia-smi, you could try running a GPU stress test with GPUInspector.jl.
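If you want to put a number on that performance increase, here is a rough sketch of a CPU vs GPU timing comparison. The matrix size and variable names are just assumptions for illustration, and the timings will depend heavily on your hardware:

```julia
using CUDA

# Float32 inputs so the CPU and GPU do comparable work
# (cu() converts Float64 arrays to Float32 by default).
A = rand(Float32, 2000, 2000)
B = rand(Float32, 2000, 2000)
dA, dB = cu(A), cu(B)

# Warm up both paths so compilation time is not included in the measurement.
A * B
CUDA.@sync dA * dB

t_cpu = @elapsed A * B
# CUDA.@sync blocks until the GPU has finished; without it,
# only the (asynchronous) kernel launch would be timed.
t_gpu = @elapsed CUDA.@sync dA * dB

println("CPU: ", t_cpu, " s   GPU: ", t_gpu, " s")
```

If the GPU time is far lower than the CPU time, the work is almost certainly landing on the GPU, whatever the processes table in nvidia-smi says.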

If you are running the examples in the REPL and looking at nvidia-smi concurrently from a different terminal, then a julia process should show up, even if it is not computing anything at that moment, since it still has GPU memory allocated.
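You can also ask CUDA.jl directly, from inside the same Julia session, whether it sees the GPU and how much device memory is in use. This is only a minimal sketch (the array size is arbitrary), but it does not depend on nvidia-smi listing the process:

```julia
using CUDA

# true if CUDA.jl found a usable driver and GPU
println(CUDA.functional())

# Name of the device this session is currently using
println(CUDA.name(CUDA.device()))

# Allocate something on the GPU, then print the device memory usage
A = cu(rand(1000, 1000))
CUDA.memory_status()
```

If memory usage goes up after the allocation, the session is holding GPU memory even when the processes section stays empty.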

This is strange. Using the stresstest, I still see nothing in nvidia-smi’s processes.

However, when I look at the reporting tools that come with GPUInspector, I can see the GPU is being utilized. It’s reporting some of the metrics that I mentioned before (power, temperature, fan use). It also shows GPU% utilization, which appears non-zero (not the case in nvidia-smi; see the screenshot above).

I’m not sure why nvidia-smi isn’t reporting these processes. Do you see them when you run the stress test?

Update: I ran the monitor commands from GPUInspector, wrapped them around the ODE computations that I was originally trying to run, and still saw 0% GPU utilization until the very end of the computation.

Ya that’s strange, I see nothing. What do you see in nvidia-smi when you run julia?