After a fresh installation of CUDA.jl 1.1/1.0, we got the following error on a cluster when trying to precompile a package that has CUDA.jl among its dependencies:
[ Info: Precompiling ParallelStencil [94395366-693c-11ea-3b26-d9b7aac5d958]
Downloading artifact: CUDNN_CUDA102
Downloading artifact: CUTENSOR_CUDA102
ERROR: LoadError: LoadError: LoadError: could not load library "/home/lraess/.julia/artifacts/fbe34931d3c1bebd56fbc2edba0f8ece5295fed7/lib/libcutensor.so"
/lib64/libc.so.6: version `GLIBC_2.14' not found (required by /home/lraess/.julia/artifacts/fbe34931d3c1bebd56fbc2edba0f8ece5295fed7/lib/libcutensor.so)
Stacktrace:
NOTE 1: the package does not use CUTENSOR.
NOTE 2: CUDA was installed as follows:
module load CUDA/10.0
export JULIA_CUDA_USE_BINARYBUILDER=false
julia
] add CUDA
We would very much appreciate quick help, as we need to run something there for the JuliaCon video!
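For reference, the system glibc version can be checked directly from Julia (a minimal sketch; the error above indicates that the artifact requires at least GLIBC_2.14, and the library path is the one from the error message):

julia> run(`ldd --version`)   # the first output line reports the system glibc version

julia> # list the GLIBC symbol versions the artifact library requires
julia> run(pipeline(`objdump -T /home/lraess/.julia/artifacts/fbe34931d3c1bebd56fbc2edba0f8ece5295fed7/lib/libcutensor.so`, `grep GLIBC`))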
Thanks for your reply @giordano. Yes, it looks like the libc.so.6 that is found is too old. There is another one available on the cluster, and when I add its path to LD_LIBRARY_PATH, ldd on libcutensor.so no longer reports the libc.so.6 error.
However, with that variable exported, Julia cannot be started: it segfaults at startup. Setting LD_LIBRARY_PATH inside Julia via ENV did not work either; it gave the following error:
julia> ENV["LD_LIBRARY_PATH"] = "/soft/glibc/glibc-2.17/lib:$(ENV["LD_LIBRARY_PATH"])"
"/soft/glibc/glibc-2.17/lib:/soft/glibc/glibc-2.17/lib:\$LD_LIBRARY_PATH"
julia> using ParallelStencil
[ Info: Precompiling ParallelStencil [94395366-693c-11ea-3b26-d9b7aac5d958]
ERROR: IOError: write: broken pipe (EPIPE)
Stacktrace:
[1] uv_write(::Base.PipeEndpoint, ::Ptr{UInt8}, ::UInt64) at ./stream.jl:953
[2] unsafe_write(::Base.PipeEndpoint, ::Ptr{UInt8}, ::UInt64) at ./stream.jl:1007
[3] write(::Base.PipeEndpoint, ::String) at ./strings/io.jl:183
[4] create_expr_cache(::String, ::String, ::Array{Pair{Base.PkgId,UInt64},1}, ::Base.UUID) at ./loading.jl:1176
[5] compilecache(::Base.PkgId, ::String) at ./loading.jl:1261
[6] _require(::Base.PkgId) at ./loading.jl:1029
[7] require(::Base.PkgId) at ./loading.jl:927
[8] require(::Module, ::Symbol) at ./loading.jl:922
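A likely explanation of the EPIPE: mutating ENV does not change the current session, because the dynamic linker reads LD_LIBRARY_PATH only once at process startup, but it does propagate to child processes; the precompilation worker then starts against the other glibc and dies, which breaks the pipe. To experiment with the variable it therefore has to be set before Julia starts, e.g. by respawning Julia from Julia. A minimal sketch, reusing the glibc path from above (with the caveat that, as observed, pointing the loader at a different glibc can itself segfault):

julia> env = Dict(ENV);                                    # copy the current environment
julia> env["LD_LIBRARY_PATH"] = "/soft/glibc/glibc-2.17/lib";
julia> run(setenv(`julia`, env))                           # child Julia starts with the modified loader path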
Yeah, I wouldn’t expect changing this variable inside Julia to be much help. Out of curiosity, do you have any CUDA module available on the cluster? What version? If that’s the case, you can use JULIA_CUDA_USE_BINARYBUILDER=false. Alternatively, maybe you can try with an older version of CUDA provided by Julia.
If you want to keep using artifacts, you can use an older version that does not provide CUTENSOR by specifying JULIA_CUDA_VERSION=.... Or you could disable the CUTENSOR version check in __runtime_init__ (in CUDA.jl’s initialization.jl); it’s that check that triggers use of the library.
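Concretely, a minimal sketch of the two environment-variable routes (set them before `using CUDA` in a fresh session, which should be enough since CUDA.jl only resolves its dependencies at initialization; the version number below is only a placeholder):

julia> ENV["JULIA_CUDA_USE_BINARYBUILDER"] = "false"   # use the local toolkit, e.g. from `module load CUDA`

julia> # or, staying with artifacts, pin an older toolkit that ships without CUTENSOR:
julia> ENV["JULIA_CUDA_VERSION"] = "10.1"              # placeholder version

julia> using CUDA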
@maleadt, a small related question: am I right in assuming that CUDA-aware MPI will not work when one uses artifacts? For CUDA-aware MPI, one needs a CUDA-aware MPI installation and to build CUDA.jl and MPI.jl against the CUDA and MPI used for that CUDA-aware MPI installation, right?
I don’t have experience with CUDA-aware MPI + different CUDA toolkits (maybe @vchuravy or @simonbyrne do). Generally the toolkit is backwards compatible, though.
Thanks @vchuravy. This is what we have done. I am just trying to understand whether, in general, it would also be possible to use CUDA-aware MPI with artifacts, as I know multiple people getting started with small clusters / multi-GPU desktops with Julia, GPU and MPI… So to get started, or when a quick temporary solution is needed, it would be nice to be able to use CUDA-aware MPI with artifacts…
Seconding this. AFAIK CUDA-aware MPI is the only way to do multi-GPU synchronization on a desktop machine/local workstation, but unlike on a cluster it is almost certainly not installed, and I haven’t been able to find binaries anywhere…
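For anyone wanting to check what their setup supports: a minimal sketch of a CUDA-awareness test (assuming MPI.jl is built against the MPI in question; it hands device arrays directly to MPI, which only works with a CUDA-aware build and typically crashes otherwise). Launch with something like mpirun -np 2 julia cuda_aware_test.jl:

using MPI, CUDA

MPI.Init()
comm   = MPI.COMM_WORLD
rank   = MPI.Comm_rank(comm)
nranks = MPI.Comm_size(comm)

# device buffers handed directly to MPI: only a CUDA-aware MPI can access them
send = CUDA.fill(Float64(rank), 4)
recv = CUDA.zeros(Float64, 4)

# exchange with the neighbouring ranks in a ring
dst = mod(rank + 1, nranks)
src = mod(rank - 1, nranks)
MPI.Sendrecv!(send, dst, 0, recv, src, 0, comm)

println("rank $rank received ", Array(recv))
MPI.Finalize()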
@maleadt, it would seem more intuitive to me that one needs to specify JULIA_CUDA_USE_BINARYBUILDER=false only at installation time. BTW: I think that is what @simonbyrne meant in this issue when he wrote: "It would also be useful to have a mechanism to save these preferences so that they persist between sessions/versions (FFTW.jl and MPI.jl save their preferences to a file in .julia/prefs/)".
I also believe that this would be an improvement - or is there anything that speaks against it?
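To illustrate, a purely hypothetical sketch of persisting the choice the way FFTW.jl/MPI.jl do (the file name and key below are made up; this is not an existing CUDA.jl mechanism):

# hypothetical: write the preference to ~/.julia/prefs/ so that the package
# could read it back at build/initialization time in later sessions
prefs_dir = joinpath(first(DEPOT_PATH), "prefs")   # typically ~/.julia/prefs
mkpath(prefs_dir)
write(joinpath(prefs_dir, "CUDA"), "use_binarybuilder = false\n")   # made-up file and key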