I recently updated my system and Julia and now the following program fails (it didn’t fail before the updates):
using Flux
using MLDatasets
using CUDA
using Random
a = CuArray{Float32}(undef, 2)
Random.rand!(CUDA.default_rng(), a)
I get this error:
InvalidIRError: compiling kernel rand!(CuDeviceVector{Float32, 1}, UInt32, UInt32) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to CUDA.Philox2x32{R}() where R in CUDA at ~/.julia/packages/CUDA/AHr5I/src/device/random.jl:46)
[1] Philox2x32
@ ~/.julia/packages/CUDA/AHr5I/src/device/random.jl:62
[2] #default_rng
@ ~/.julia/packages/CUDA/AHr5I/src/device/random.jl:95
[3] kernel
@ ~/.julia/packages/CUDA/AHr5I/src/random.jl:39
Reason: unsupported dynamic function invocation (call to rand(rng::AbstractRNG, ::Type{X}) where X in Random at /usr/share/julia/stdlib/v1.7/Random/src/Random.jl:257)
[1] kernel
@ ~/.julia/packages/CUDA/AHr5I/src/random.jl:51
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code
[1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{CUDA.var"#kernel#356", Tuple{CuDeviceVector{Float32, 1}, UInt32, UInt32}}}, args::LLVM.Module)
@ GPUCompiler ~/.julia/packages/GPUCompiler/1FdJy/src/validation.jl:124
[2] macro expansion
@ ~/.julia/packages/GPUCompiler/1FdJy/src/driver.jl:386 [inlined]
[3] macro expansion
@ ~/.julia/packages/TimerOutputs/LDL7n/src/TimerOutput.jl:252 [inlined]
[4] macro expansion
@ ~/.julia/packages/GPUCompiler/1FdJy/src/driver.jl:384 [inlined]
[5] emit_asm(job::GPUCompiler.CompilerJob, ir::LLVM.Module; strip::Bool, validate::Bool, format::LLVM.API.LLVMCodeGenFileType)
@ GPUCompiler ~/.julia/packages/GPUCompiler/1FdJy/src/utils.jl:64
[6] cufunction_compile(job::GPUCompiler.CompilerJob, ctx::LLVM.Context)
@ CUDA ~/.julia/packages/CUDA/AHr5I/src/compiler/execution.jl:332
[7] #260
@ ~/.julia/packages/CUDA/AHr5I/src/compiler/execution.jl:325 [inlined]
[8] JuliaContext(f::CUDA.var"#260#261"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{CUDA.var"#kernel#356", Tuple{CuDeviceVector{Float32, 1}, UInt32, UInt32}}}})
@ GPUCompiler ~/.julia/packages/GPUCompiler/1FdJy/src/driver.jl:74
[9] cufunction_compile(job::GPUCompiler.CompilerJob)
@ CUDA ~/.julia/packages/CUDA/AHr5I/src/compiler/execution.jl:324
[10] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
@ GPUCompiler ~/.julia/packages/GPUCompiler/1FdJy/src/cache.jl:90
[11] cufunction(f::CUDA.var"#kernel#356", tt::Type{Tuple{CuDeviceVector{Float32, 1}, UInt32, UInt32}}; name::String, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ CUDA ~/.julia/packages/CUDA/AHr5I/src/compiler/execution.jl:297
[12] macro expansion
@ ~/.julia/packages/CUDA/AHr5I/src/compiler/execution.jl:102 [inlined]
[13] rand!(rng::CUDA.RNG, A::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
@ CUDA ~/.julia/packages/CUDA/AHr5I/src/random.jl:60
[14] top-level scope
@ ~/tmp/test.jl:7
in expression starting at ~/tmp/test.jl:7
If I remove using Flux, using MLDatasets, or both, then it works. The error only appears if I load both Flux and MLDatasets. And if I run the Random.rand! command both before and after these two using statements, it also works without any error.
I don’t understand what’s happening, and I don’t know if it’s a bug in CUDA, Flux or MLDatasets. Where should I report that?
For now I’ll just run the rand! command before everything else in my code, but a real fix would be better.
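For reference, here is a minimal sketch of the workaround described above, assuming the same package versions as listed below and a working CUDA device. It only reorders the original reproducer so that the rand! kernel is compiled before Flux and MLDatasets are loaded; it does not explain or fix the underlying problem.

```julia
using CUDA
using Random

# Warm-up call: compile the GPU rand! kernel before the other packages are loaded.
a = CuArray{Float32}(undef, 2)
Random.rand!(CUDA.default_rng(), a)

# Loading both packages after the warm-up call did not trigger the error for me.
using Flux
using MLDatasets

# The same call after the two `using` statements now also works.
Random.rand!(CUDA.default_rng(), a)
```

The second call presumably reuses the kernel compiled by the first one, which would be why it no longer fails, but that is a guess on my part.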
julia> versioninfo()
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
LIBM: libopenlibm
LLVM: libLLVM-12.0.1 (ORCJIT, skylake)
julia> CUDA.versioninfo()
CUDA toolkit 11.6, artifact installation
NVIDIA driver 510.68.2, for CUDA 11.6
CUDA driver 11.6
- CUBLAS: 11.8.1
- CURAND: 10.2.9
- CUFFT: 10.7.0
- CUSOLVER: 11.3.2
- CUSPARSE: 11.7.1
- CUPTI: 16.0.0
- NVML: 11.0.0+510.68.2
- CUDNN: 8.30.2 (for CUDA 11.5.0)
- CUTENSOR: 1.4.0 (for CUDA 11.5.0)
- Julia: 1.7.2
- LLVM: 12.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0
- Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80
1 device:
0: NVIDIA GeForce GTX 1080 (sm_61, 3.941 GiB / 8.000 GiB available)
I also tried with CUDA master and the problem is still there.