Hi, I’m using Julia 1.2.0 with the following packages installed:
(v1.2) pkg> st
Status `~/.julia/environments/v1.2/Project.toml`
[fbb218c0] BSON v0.2.4
[3895d2a7] CUDAapi v1.2.0
[c5f51814] CUDAdrv v3.1.0
[be33ccc6] CUDAnative v2.4.0
[3a865a2d] CuArrays v1.3.0
[587475ba] Flux v0.8.3
[f6369f11] ForwardDiff v0.10.6
[23992714] MAT v0.6.0
I saw this issue, which was similar to mine, so I pinned the CUDA-related packages to the versions suggested there.
I’m running this on a compute cluster where cuDNN 7.4–7.6 are installed. I tried loading each of them, but none of them worked.
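For reference, this is roughly how I expose cuDNN to Julia before launching my script (the `CUDNN_HOME` path below is a placeholder; on the cluster I use whatever the cuDNN module actually sets):

```shell
# Placeholder path -- substitute the cluster's real cuDNN install directory,
# e.g. whatever `module show cudnn/7.4` reports.
export CUDNN_HOME=/opt/cudnn-7.4

# As far as I understand, CuArrays locates libcudnn through the dynamic
# loader's search path, so the cuDNN lib directory must be visible
# before Julia starts:
export LD_LIBRARY_PATH="$CUDNN_HOME/lib64:${LD_LIBRARY_PATH:-}"
echo "$LD_LIBRARY_PATH"

# Then, in this same shell:  julia train.jl
```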
With cuDNN 7.4 and CUDA 10.0.130, I get this error:
┌ Warning: CUDNN is not installed, some functionality will not be available.
└ @ Flux.CUDA ~/.julia/packages/Flux/qXNjB/src/cuda/cuda.jl:35
ERROR: LoadError: Your installation does not provide libcudnn, CuArrays.CUDNN is unavailable
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] macro expansion at /home/jchen975/.julia/packages/CuArrays/kOUu1/src/dnn/error.jl:17 [inlined]
[3] cudnnGetProperty at /home/jchen975/.julia/packages/CuArrays/kOUu1/src/dnn/libcudnn.jl:474 [inlined]
[4] version() at /home/jchen975/.julia/packages/CuArrays/kOUu1/src/dnn/CUDNN.jl:43
[5] #conv!#22(::Int64, ::Int64, ::typeof(conv!), ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::DenseConvDims{2,(1, 3),4,8,(1, 1),(0, 0, 1, 1),(1, 1),false}) at /home/jchen975/.julia/packages/CuArrays/kOUu1/src/dnn/nnlib.jl:44
[6] conv!(::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::DenseConvDims{2,(1, 3),4,8,(1, 1),(0, 0, 1, 1),(1, 1),false}) at /home/jchen975/.julia/packages/CuArrays/kOUu1/src/dnn/nnlib.jl:44
[7] macro expansion at /home/jchen975/.julia/packages/NNlib/mxWRT/src/conv.jl:114 [inlined]
[8] #conv#97(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(conv), ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::DenseConvDims{2,(1, 3),4,8,(1, 1),(0, 0, 1, 1),(1, 1),false}) at /home/jchen975/.julia/packages/TimerOutputs/7Id5J/src/TimerOutput.jl:190
[9] #_forward#524 at /home/jchen975/.julia/packages/TimerOutputs/7Id5J/src/TimerOutput.jl:197 [inlined]
[10] _forward(::typeof(conv), ::CuArray{Float32,4}, ::TrackedArray{…,CuArray{Float32,4}}, ::DenseConvDims{2,(1, 3),4,8,(1, 1),(0, 0, 1, 1),(1, 1),false}) at ./none:0
[11] #track#1(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(Tracker.track), ::typeof(conv), ::CuArray{Float32,4}, ::Vararg{Any,N} where N) at /home/jchen975/.julia/packages/Tracker/JhqMQ/src/Tracker.jl:52
[12] track at /home/jchen975/.julia/packages/Tracker/JhqMQ/src/Tracker.jl:52 [inlined]
[13] #conv#522 at /home/jchen975/.julia/packages/Tracker/JhqMQ/src/lib/array.jl:444 [inlined]
[14] conv at /home/jchen975/.julia/packages/Tracker/JhqMQ/src/lib/array.jl:444 [inlined]
[15] (::Conv{2,2,typeof(elu),TrackedArray{…,CuArray{Float32,4}},TrackedArray{…,CuArray{Float32,1}}})(::CuArray{Float32,4}) at /home/jchen975/.julia/packages/Flux/qXNjB/src/layers/conv.jl:55
[16] applychain(::Tuple{Conv{2,2,typeof(elu),TrackedArray{…,CuArray{Float32,4}},TrackedArray{…,CuArray{Float32,1}}},Conv{2,2,typeof(elu),TrackedArray{…,CuArray{Float32,4}},TrackedArray{…,CuArray{Float32,1}}},Conv{2,2,typeof(elu),TrackedArray{…,CuArray{Float32,4}},TrackedArray{…,CuArray{Float32,1}}},getfield(Main, Symbol("##3#4")),Dense{typeof(identity),TrackedArray{…,CuArray{Float32,2}},TrackedArray{…,CuArray{Float32,1}}}}, ::CuArray{Float32,4}) at /home/jchen975/.julia/packages/Flux/qXNjB/src/layers/basic.jl:31
[17] (::Chain{Tuple{Conv{2,2,typeof(elu),TrackedArray{…,CuArray{Float32,4}},TrackedArray{…,CuArray{Float32,1}}},Conv{2,2,typeof(elu),TrackedArray{…,CuArray{Float32,4}},TrackedArray{…,CuArray{Float32,1}}},Conv{2,2,typeof(elu),TrackedArray{…,CuArray{Float32,4}},TrackedArray{…,CuArray{Float32,1}}},getfield(Main, Symbol("##3#4")),Dense{typeof(identity),TrackedArray{…,CuArray{Float32,2}},TrackedArray{…,CuArray{Float32,1}}}}})(::CuArray{Float32,4}) at /home/jchen975/.julia/packages/Flux/qXNjB/src/layers/basic.jl:33
[18] (::getfield(Main, Symbol("#pred_norm#6")))(::CuArray{Float32,4}, ::CuArray{Float32,2}) at /scratch/jchen975/train/train.jl:200
[19] train_net(::Array{Float32,3}, ::Array{Float32,2}, ::String, ::String, ::Float64, ::Int64, ::Float64, ::Int64, ::Int64) at /scratch/jchen975/train/train.jl:204
[20] train_net(::Array{Float32,3}, ::Array{Float32,2}, ::String, ::String, ::Float64, ::Int64) at /scratch/jchen975/train/train.jl:164
[21] macro expansion at ./util.jl:156 [inlined]
[22] main(::Array{String,1}) at /scratch/jchen975/train/train.jl:456
[23] top-level scope at /scratch/jchen975/train/train.jl:539
[24] include at ./boot.jl:328 [inlined]
[25] include_relative(::Module, ::String) at ./loading.jl:1094
[26] include(::Module, ::String) at ./Base.jl:31
[27] exec_options(::Base.JLOptions) at ./client.jl:295
[28] _start() at ./client.jl:464
in expression starting at /scratch/jchen975/train/train.jl:539
The code runs without error on my local machine, but I only have 2 GB of VRAM there, so I need to run it on the cluster to see how larger models perform.
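In case it helps with diagnosis, this is roughly how I check which cuDNN libraries the loader can actually see on a node (the directories are guesses at common install locations, not the cluster's confirmed paths):

```shell
# Scan a few likely locations for libcudnn (paths are assumptions;
# adjust to the cluster's actual CUDA/cuDNN install directories).
for dir in /usr/lib/x86_64-linux-gnu /usr/local/cuda/lib64 "${CUDA_HOME:-/usr/local/cuda}/lib64"; do
  found=$(ls "$dir"/libcudnn.so* 2>/dev/null)
  if [ -n "$found" ]; then
    echo "cuDNN in $dir:"
    echo "$found"
  fi
done
echo "search complete"
```

On the cluster this prints nothing before "search complete", which matches the "CUDNN is not installed" warning from Flux.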
I’ve been stuck on this for the last week or so; any help would be much appreciated :((