Hi guys, when using the Julia backend on Google Colab, I haven’t been able to get Reactant to find any NVIDIA GPU after selecting a GPU runtime, e.g. when I select the A100 or the L4 GPU and then try to run:
I’m wondering if anyone else is facing a similar issue?
As a comparison, I checked to see if at least CUDA.jl is able to find the GPU:
using Pkg
Pkg.add("CUDA")
using CUDA

println("Attempting to check CUDA functionality...")
if CUDA.functional()
    println("SUCCESS: CUDA.jl is functional and a GPU is available!")
    println("CUDA versioninfo():")
    CUDA.versioninfo()
    println("\nGPU name: ", CUDA.name(CuDevice(0)))
else
    println("FAILURE: CUDA.jl is NOT functional or no GPU is available through CUDA.jl.")
    try
        CUDA.versioninfo()
    catch e
        println("Error calling CUDA.versioninfo(): ", e)
    end
    println("Please ensure a GPU is allocated in Colab (Runtime > Change runtime type > GPU).")
end

println("\nNow, attempting to check Reactant.jl again (after the CUDA.jl check)...")
using Reactant
try
    Reactant.set_default_backend("gpu")
    println("SUCCESS: Reactant.set_default_backend(\"gpu\") did NOT error this time.")
    println("Reactant default backend: ", Reactant.get_default_backend())
catch e
    println("FAILURE: Reactant.set_default_backend(\"gpu\") still errored.")
    println("Error: ", e)
end
Here is the output of that diagnostic:
Attempting to check CUDA functionality...
SUCCESS: CUDA.jl is functional and a GPU is available!
CUDA versioninfo():
CUDA runtime 12.5, local installation
CUDA driver 12.9
NVIDIA driver 550.54.15
CUDA libraries:
- CUBLAS: 12.5.3
- CURAND: 10.3.6
- CUFFT: 11.2.3
- CUSOLVER: 11.6.3
- CUSPARSE: 12.5.1
- CUPTI: 2024.2.1 (API 23.0.0)
- NVML: 12.0.0+550.54.15
Julia packages:
- CUDA: 5.8.0
- CUDA_Driver_jll: 0.13.0+0
- CUDA_Runtime_jll: 0.17.0+0
- CUDA_Runtime_Discovery: 0.3.5
Toolchain:
- Julia: 1.10.9
- LLVM: 15.0.7
Preferences:
- CUDA_Runtime_jll.version: 12.5.1
- CUDA_Runtime_jll.local: true
1 device:
0: NVIDIA L4 (sm_89, 21.976 GiB / 22.494 GiB available)
GPU name: NVIDIA L4
Now, attempting to check Reactant.jl again (after the CUDA.jl check)...
FAILURE: Reactant.set_default_backend("gpu") still errored.
Error: ErrorException("No GPU client found")
A similar problem occurs when I use a different GPU, e.g. an A100. I’d like to make sure this isn’t some easily fixed issue on my end before I submit an issue on
Resolving package versions...
No Changes to `~/.julia/environments/v1.10/Project.toml`
No Changes to `~/.julia/environments/v1.10/Manifest.toml`
Precompiling packages...
2137.5 ms ✓ Reactant_jll
1 dependency successfully precompiled in 4 seconds. 458 already precompiled.
1 dependency precompiled but a different version is currently loaded. Restart julia to access the new version
1 dependency had output during precompilation:
┌ Reactant_jll
│ ┌ Debug: Detected CUDA Driver version 12.4.0
│ └ @ Reactant_jll ~/.julia/packages/Reactant_jll/ygsaO/.pkg/platform_augmentation.jl:60
│ ┌ Debug: Adding include dependency on /usr/lib64-nvidia/libcuda.so.1
│ └ @ Reactant_jll ~/.julia/packages/Reactant_jll/ygsaO/.pkg/platform_augmentation.jl:108
└
Despite that last message, Reactant.set_default_backend("gpu") still returns with:
AssertionError("Could not find registered platform with name: \"cuda\". Available platform names are: ")
No GPU client found
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] client(backend::String)
@ Reactant.XLA ~/.julia/packages/Reactant/cTiTU/src/xla/XLA.jl:82
[3] set_default_backend
@ ~/.julia/packages/Reactant/cTiTU/src/xla/XLA.jl:104 [inlined]
[4] set_default_backend(backend::String)
@ Reactant ~/.julia/packages/Reactant/cTiTU/src/Reactant.jl:293
[5] top-level scope
@ In[7]:2
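For an independent check that the driver library named in the debug output can be loaded from the Julia process, something like the following might help. It is only a sketch: `cuda_driver_version` is a hypothetical helper name, the default search directory is taken from the log above, and this is not necessarily how Reactant_jll performs its own detection.

```julia
using Libdl

# Hypothetical helper: ask the NVIDIA driver library for the CUDA version it supports.
# Returns `nothing` when libcuda cannot be located, loaded, or queried.
function cuda_driver_version(dirs = ["/usr/lib64-nvidia"])
    # find_library returns an empty string when none of the names can be loaded.
    path = Libdl.find_library(["libcuda.so.1", "libcuda"], dirs)
    isempty(path) && return nothing

    handle = Libdl.dlopen(path; throw_error = false)
    handle === nothing && return nothing

    fptr = Libdl.dlsym(handle, :cuDriverGetVersion; throw_error = false)
    fptr === nothing && return nothing

    version = Ref{Cint}(0)
    status = ccall(fptr, Cint, (Ptr{Cint},), version)
    status == 0 || return nothing

    # The driver encodes the version as 1000 * major + 10 * minor.
    major, rest = divrem(Int(version[]), 1000)
    return VersionNumber(major, rest ÷ 10)
end

println("CUDA driver version seen from Julia: ", cuda_driver_version())
```

If this prints a version, the driver itself is visible and the problem is more likely in which Reactant_jll build got precompiled.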
Since the Reactant build I get is the one without support for CUDA, perhaps I should find a way to manually update the Julia that’s installed on the Colab runtime when I start it?
Ok, I just tested in Colab myself. You need to restart the session after that, and then it should work.
My understanding is that Reactant comes preinstalled in the default environment, but it was built on a non-GPU runtime (either CPU-only or CPU+TPU), so the default precompiled pkgimage has no GPU support. You need to force Reactant_jll to be re-precompiled.
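One possible way to force that, sketched under assumptions: find the existing precompile cache files for Reactant_jll and delete them, so the next `using Reactant` has to precompile it again on the GPU runtime. `stale_cache_files` is a hypothetical helper name, `Base.find_all_in_cache_path` is an internal Julia function that may change between versions, and I have not verified that the preinstalled cache on Colab lives in a writable depot.

```julia
using Pkg

# Hypothetical helper: list the precompile cache files Julia has for a package
# in the active environment, looked up by name. Returns an empty vector when
# the package is not in the manifest.
function stale_cache_files(name::AbstractString)
    for (uuid, info) in Pkg.dependencies()
        info.name == name || continue
        # Internal API (assumption: available in this Julia version).
        return Base.find_all_in_cache_path(Base.PkgId(uuid, name))
    end
    return String[]
end

# Inspect first; uncomment the loop to actually delete the cache files.
println(stale_cache_files("Reactant_jll"))
# for file in stale_cache_files("Reactant_jll")
#     rm(file; force = true)
# end
```

After removing the files, restart the session before loading Reactant again, since an already loaded version stays in use until Julia is restarted.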