Hi!
I stumbled upon an unexpected problem with package/library management in Julia that I think is interesting to discuss, because it seems quite general and I haven't found a similar topic.
My env:
(@v1.7) pkg> status Flux
Status `~/.julia/environments/v1.7/Project.toml`
[587475ba] Flux v0.13.4
(@v1.7) pkg> status DataFrames
Status `~/.julia/environments/v1.7/Project.toml`
[a93c6f00] DataFrames v1.3.4
(@v1.7) pkg> status CUDA
Status `~/.julia/environments/v1.7/Project.toml`
[052768ef] CUDA v3.12.0
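For anyone who wants to reproduce this, the same versions could be pinned in a throwaway environment, for example (just a sketch; resolution may differ depending on your registry state and CUDA setup):

# Sketch: set up a temporary environment with the package versions listed above.
using Pkg
Pkg.activate(; temp = true)
Pkg.add([
    Pkg.PackageSpec(name = "Flux", version = "0.13.4"),
    Pkg.PackageSpec(name = "DataFrames", version = "1.3.4"),
    Pkg.PackageSpec(name = "CUDA", version = "3.12.0"),
])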
The problem appears when I use both Flux and DataFrames:
julia> using Flux
[ Info: Precompiling Flux [587475ba-b771-5e3f-ad9e-33799f191a9c]
julia> using CUDA
julia> using DataFrames
julia> m = Dropout(0.5) |> gpu
Dropout(0.5)
julia> trainmode!(m)
Dropout(0.5)
julia> m(CUDA.rand(10))
ERROR: InvalidIRError: compiling kernel rand!(CuDeviceVector{Float32, 1}, UInt32, UInt32) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to CUDA.Philox2x32{R}() where R in CUDA at /home/bodo/.julia/packages/CUDA/DfvRa/src/device/random.jl:46)
Stacktrace:
[1] Philox2x32
@ ~/.julia/packages/CUDA/DfvRa/src/device/random.jl:62
[2] #default_rng
@ ~/.julia/packages/CUDA/DfvRa/src/device/random.jl:95
[3] kernel
@ ~/.julia/packages/CUDA/DfvRa/src/random.jl:41
Reason: unsupported dynamic function invocation (call to rand(rng::Random.AbstractRNG, ::Type{X}) where X in Random at /opt/julia-1.7.1/share/julia/stdlib/v1.7/Random/src/Random.jl:257)
Stacktrace:
[1] kernel
@ ~/.julia/packages/CUDA/DfvRa/src/random.jl:53
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code
Stacktrace:
[1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{CUDA.var"#kernel#320", Tuple{CuDeviceVector{Float32, 1}, UInt32, UInt32}}}, args::LLVM.Module)
@ GPUCompiler ~/.julia/packages/GPUCompiler/N98un/src/validation.jl:141
[2] macro expansion
@ ~/.julia/packages/GPUCompiler/N98un/src/driver.jl:418 [inlined]
[3] macro expansion
@ ~/.julia/packages/TimerOutputs/jgSVI/src/TimerOutput.jl:252 [inlined]
[4] macro expansion
@ ~/.julia/packages/GPUCompiler/N98un/src/driver.jl:416 [inlined]
[5] emit_asm(job::GPUCompiler.CompilerJob, ir::LLVM.Module; strip::Bool, validate::Bool, format::LLVM.API.LLVMCodeGenFileType)
@ GPUCompiler ~/.julia/packages/GPUCompiler/N98un/src/utils.jl:64
[6] cufunction_compile(job::GPUCompiler.CompilerJob, ctx::LLVM.Context)
@ CUDA ~/.julia/packages/CUDA/DfvRa/src/compiler/execution.jl:354
[7] #224
@ ~/.julia/packages/CUDA/DfvRa/src/compiler/execution.jl:347 [inlined]
[8] JuliaContext(f::CUDA.var"#224#225"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{CUDA.var"#kernel#320", Tuple{CuDeviceVector{Float32, 1}, UInt32, UInt32}}}})
@ GPUCompiler ~/.julia/packages/GPUCompiler/N98un/src/driver.jl:76
[9] cufunction_compile(job::GPUCompiler.CompilerJob)
@ CUDA ~/.julia/packages/CUDA/DfvRa/src/compiler/execution.jl:346
[10] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
@ GPUCompiler ~/.julia/packages/GPUCompiler/N98un/src/cache.jl:90
[11] cufunction(f::CUDA.var"#kernel#320", tt::Type{Tuple{CuDeviceVector{Float32, 1}, UInt32, UInt32}}; name::String, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ CUDA ~/.julia/packages/CUDA/DfvRa/src/compiler/execution.jl:299
[12] macro expansion
@ ~/.julia/packages/CUDA/DfvRa/src/compiler/execution.jl:102 [inlined]
[13] rand!(rng::CUDA.RNG, A::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
@ CUDA ~/.julia/packages/CUDA/DfvRa/src/random.jl:62
[14] _dropout_mask(rng::CUDA.RNG, x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, p::Float64; dims::Function)
@ Flux ~/.julia/packages/Flux/KkC79/src/layers/normalise.jl:45
[15] #dropout_mask#318
@ ~/.julia/packages/Flux/KkC79/src/layers/normalise.jl:39 [inlined]
[16] dropout(rng::CUDA.RNG, x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, p::Float64; dims::Function, active::Bool)
@ Flux ~/.julia/packages/Flux/KkC79/src/layers/normalise.jl:34
WARNING: both Losses and NNlib export "ctc_loss"; uses of it in module Flux must be qualified
[17] (::Dropout{Float64, Colon, CUDA.RNG})(x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
@ Flux ~/.julia/packages/Flux/KkC79/src/layers/normalise.jl:111
[18] top-level scope
@ REPL[9]:1
[19] top-level scope
@ ~/.julia/packages/CUDA/DfvRa/src/initialization.jl:52
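Following the hint printed with the error, one way to dig further would be to catch the exception and introspect the failing kernel. This is only a sketch (I have not run it); the interactive mode additionally needs Cthulhu.jl installed, and it assumes the caught exception is the InvalidIRError shown above:

# Sketch: catch the InvalidIRError and introspect it, as the error hint suggests.
err = try
    m(CUDA.rand(10))
catch e
    e
end
code_typed(err; interactive = true)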
The Dropout layer suddenly stops working, but only on the GPU; on the CPU it works properly:
julia> m = cpu(m)
Dropout(0.5)
julia> trainmode!(m)
Dropout(0.5)
julia> m(rand(10))
10-element Vector{Float64}:
0.0
0.5422154108717265
1.415133543027147
0.0
0.7556977370953
1.022560028161302
0.5363348842605291
0.0
0.9809514073811558
0.3022707863331904
Without "using DataFrames" it also works fine on the GPU:
julia> using Flux
julia> using CUDA
julia> m = Dropout(0.5) |> gpu
Dropout(0.5)
julia> trainmode!(m)
Dropout(0.5)
julia> m(CUDA.rand(10))
10-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
1.8166121
0.28154758
1.668092
1.8140966
0.020490058
0.3029256
0.0
0.29084814
0.0
0.0
It was quite surprising, and I must say I spent some time trying to solve it. My questions are:
- Why does this happen? (I assume DataFrames exports something like rand!, but why would that suddenly break internal dependencies of Flux? See the diagnostic sketch after this list.)
- How can I solve it, if I need to use both packages in one session?
- Is this a recognized issue?
- Are there guidelines on how to prevent this when building a package?
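To probe the first question, here is a small diagnostic sketch (untested; the argument types are my guess based on the stack trace above) that checks which rand! method actually gets dispatched for a CuArray, to see whether a simple export/name clash from DataFrames could be the cause:

# Diagnostic sketch: look up the rand! method used for the CUDA RNG and a CuArray.
# If this still points at the CUDA.jl method in src/random.jl, the failure is not a
# plain name clash but something that goes wrong during GPU compilation.
using Flux, CUDA, DataFrames
import Random

x = CUDA.zeros(Float32, 10)
rng = CUDA.default_rng()
which(Random.rand!, Tuple{typeof(rng), typeof(x)})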