Can I change the nvcc location in CUDAnative?


#1

Hi, when I build the CUDAnative package on a server, I get the following error:

               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.6.2 (2017-12-13 18:08 UTC)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |  x86_64-redhat-linux

julia> Pkg.build("CUDAnative")
INFO: Building LLVM
INFO: LLVM.jl has already been built for this toolchain, no need to rebuild
INFO: Building CUDAdrv
INFO: Building CUDAnative
=================================================================================[ ERROR: CUDAnative ]=================================================================================

LoadError: could not spawn `/sw/software/cuda/9.1/centos7.3_binary/nvcc --version`: no such file or directory (ENOENT)
while loading /d/home/xiaoqihu/.julia/v0.6/CUDAnative/deps/build.jl, in expression starting on line 155

I looked for nvcc and it is located at /sw/software/cuda/9.1/centos7.3_binary/bin/nvcc, so the spawn fails because the build script is looking in the wrong directory.

I also looked into where I can change this behavior. Correct me if I am wrong, but I think it is find_toolkit() in the CUDAapi package. The documentation of this function says:

...
The behavior of this function can be overridden by defining the `CUDA_PATH`, `CUDA_HOME` or
`CUDA_ROOT` environment variables, which should point to the root of the CUDA toolkit.

I tried setting CUDA_HOME to /sw/software/cuda/9.1/centos7.3_binary/bin/, but I still get the error above.


#2

OK. I thought I needed to set the environment variable in the shell, but after reading the code of find_toolkit(), it seems I need to set it in Julia's ENV, like this:

julia> ENV["CUDA_HOME"] = "/sw/software/cuda/9.1/centos7.3_binary/bin/"

That solved the problem.


#3

That’s not true, both are identical, so I guess you didn’t set the variable properly in your shell. I’m glad it works though :slightly_smiling_face:
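To check what the Julia process actually sees, something like the following could help. This is only a sketch: `visible_cuda_overrides` is a made-up helper name, not part of CUDAapi, and it only looks at the three variable names quoted from the find_toolkit() documentation above.

```julia
# Hypothetical helper (not a CUDAapi function): list which of the toolkit
# override variables are present in an environment. Defaults to Julia's ENV,
# which mirrors the environment the process inherited from the shell.
function visible_cuda_overrides(env = ENV)
    names = ("CUDA_PATH", "CUDA_HOME", "CUDA_ROOT")
    return [name => env[name] for name in names if haskey(env, name)]
end

# repr() prints the value in quotes, so stray whitespace becomes visible.
for (name, value) in visible_cuda_overrides()
    println(name, " = ", repr(value))
end
```

If nothing is printed after setting the variable in the shell, a likely cause is that it was assigned without `export`, so child processes such as Julia never received it.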


#4

OK, I tested it again and you are right. Thank you.
However, after I set CUDA_HOME to /sw/software/cuda/9.1/centos7.3_binary/bin/, Julia can’t find libdevice anymore. I get this error:

$ TRACE=1 ~/julia0.6/julia --compilecache=no ~/.julia/v0.6/CUDAnative/deps/build.jl
TRACE: LLVM.jl is running in trace mode, this will generate a lot of additional output
DEBUG: Checking validity of bundled library at /nics/d/home/xiaoqihu/julia0.6/usr/lib/libLLVM-3.9.1.so
ERROR: LoadError: Available CUDA toolchain does not provide libdevice
Stacktrace:
 [1] main() at /d/home/xiaoqihu/.julia/v0.6/CUDAnative/deps/build.jl:122
 [2] include_from_node1(::String) at ./loading.jl:576
 [3] include(::String) at ./sysimg.jl:14
 [4] process_options(::Base.JLOptions) at ./client.jl:305
 [5] _start() at ./client.jl:371
while loading /d/home/xiaoqihu/.julia/v0.6/CUDAnative/deps/build.jl, in expression starting on line 155
DEBUG: Dropping down to post-finalizer I/O

I did find /sw/software/cuda/9.1/centos7.3_binary/nvvm/libdevice/libdevice.10.bc.


#5

CUDA_HOME should probably be /sw/software/cuda/9.1/centos7.3_binary, not pointing to the bin directory within. Could you try that? If that doesn’t work, please run with 0.7 and JULIA_DEBUG=CUDAapi and file an issue on CUDAapi.
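A quick way to sanity-check a candidate root before rebuilding might look like this. It is a minimal sketch, assuming Julia 0.7 or later and the layout reported in this thread (nvcc under bin/, libdevice under nvvm/libdevice/); `check_toolkit_root` is a hypothetical name and this is not how CUDAapi performs its discovery.

```julia
# Hypothetical helper: report whether the two pieces mentioned in this thread
# exist under `root`, using the layout seen on this server.
function check_toolkit_root(root::AbstractString)
    nvcc = joinpath(root, "bin", "nvcc")
    libdevice = joinpath(root, "nvvm", "libdevice")
    return (nvcc = isfile(nvcc), libdevice = isdir(libdevice))
end

# Example call with the path from this thread; both fields are false on a
# machine where that directory does not exist.
println(check_toolkit_root("/sw/software/cuda/9.1/centos7.3_binary"))
```

With the bin directory passed as the root, both checks fail under this layout, which would be consistent with the libdevice error above.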


#6

If I set it to /sw/software/cuda/9.1/centos7.3_binary, the error from my original question comes back: ERROR: LoadError: could not spawn `/sw/software/cuda/9.1/centos7.3_binary/nvcc --version`: no such file or directory (ENOENT)

Also, for future reference, is it better to post questions like this directly on GitHub or to ask here on Discourse? I am a newbie when it comes to community etiquette.

Here is the debugging information using 0.7:

julia> Pkg.build("CUDAnative")
  Building LLVM ──────→ `~/.julia/packages/LLVM/FAUY/deps/build.log`
  Building CUDAdrv ───→ `~/.julia/packages/CUDAdrv/GyXD/deps/build.log`
  Building CUDAnative → `~/.julia/packages/CUDAnative/mXUk/deps/build.log`
┌ Error: Error building `CUDAnative`:
│ ┌ Debug: Looking for CUDA toolkit via environment variables
│ │   CUDA_HOME = "/sw/software/cuda/9.1/centos7.3_binary/"
│ └ @ CUDAapi CUDAapi.jl:15
│ ┌ Debug: Request to look for binary nvcc
│ │   locations =
│ │    1-element Array{String,1}:
│ │     "/sw/software/cuda/9.1/centos7.3_binary/"
│ └ @ CUDAapi CUDAapi.jl:15
│ ┌ Debug: Looking for binary nvcc
│ │   locations =
│ │    25-element Array{String,1}:
│ │     "/sw/software/cuda/9.1/centos7.3_binary/"
│ │     "/sw/software/cuda/9.1/centos7.3_binary/bin"
│ │     "/sw/software/cuda/9.1/centos7.3_binary/bin"
│ │     "/usr/local/bin"
│ │     ⋮
│ │     "/d/home/xiaoqihu/cuda/bin"
│ │     "/d/home/xiaoqihu/bin"
│ │     "/d/home/xiaoqihu/cuda/bin"
│ └ @ CUDAapi CUDAapi.jl:15
│ ┌ Debug: Found binary nvcc at /sw/software/cuda/9.1/centos7.3_binary
│ └ @ CUDAapi discovery.jl:126
│ ERROR: LoadError: could not spawn `/sw/software/cuda/9.1/centos7.3_binary/nvcc --version`: no such file or directory (ENOENT)
│ Stacktrace:
│  [1] _jl_spawn(::String, ::Array{String,1}, ::Cmd, ::Tuple{Base.DevNullStream,Base.PipeEndpoint,RawFD}) at ./process.jl:370
│  [2] (::getfield(Base, Symbol("##495#496")){Cmd})(::Tuple{Base.DevNullStream,Base.PipeEndpoint,RawFD}) at ./process.jl:512
│  [3] setup_stdio(::getfield(Base, Symbol("##495#496")){Cmd}, ::Tuple{Base.DevNullStream,Pipe,IOStream}) at ./process.jl:493
│  [4] #_spawn#494(::Nothing, ::Function, ::Cmd, ::Tuple{Base.DevNullStream,Pipe,IOStream}) at ./process.jl:511
│  [5] _spawn(::Cmd, ::Tuple{Base.DevNullStream,Pipe,IOStream}) at ./process.jl:507
│  [6] #open#504(::Bool, ::Bool, ::Function, ::Cmd, ::Base.DevNullStream) at ./process.jl:601
│  [7] open at ./process.jl:591 [inlined]
│  [8] open(::Cmd, ::String, ::Base.DevNullStream) at ./process.jl:572
│  [9] read(::Cmd) at ./process.jl:646
│  [10] read(::Cmd, ::Type{String}) at ./process.jl:652
│  [11] find_toolkit_version(::Array{String,1}) at /d/home/xiaoqihu/.julia/packages/CUDAapi/g08Z/src/discovery.jl:259
│  [12] main() at /d/home/xiaoqihu/.julia/packages/CUDAnative/mXUk/deps/build.jl:114
│  [13] top-level scope at none:0
│  [14] include at ./boot.jl:317 [inlined]
│  [15] include_relative(::Module, ::String) at ./loading.jl:1075
│  [16] include(::Module, ::String) at ./sysimg.jl:29
│  [17] include(::String) at ./client.jl:393
│  [18] top-level scope at none:0
│ in expression starting at /d/home/xiaoqihu/.julia/packages/CUDAnative/mXUk/deps/build.jl:156
└ @ Pkg.Operations Operations.jl:973
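
The debug output reports nvcc as found in the root directory, while the binary is in bin/ according to the first post. To cross-check the candidate list by hand, a sketch along these lines may be useful. `dirs_containing` is a hypothetical helper and not CUDAapi's discovery code, and the candidate list is just the first few entries copied from the log above.

```julia
# Hypothetical cross-check: return the candidate directories that really
# contain a regular file called `name`.
function dirs_containing(name::AbstractString, candidates)
    return [dir for dir in candidates if isfile(joinpath(dir, name))]
end

# First few search locations from the debug log.
candidates = [
    "/sw/software/cuda/9.1/centos7.3_binary/",
    "/sw/software/cuda/9.1/centos7.3_binary/bin",
    "/usr/local/bin",
]
println(dirs_containing("nvcc", candidates))
```

If only the bin entry comes back on the server, that would suggest the directory reported as "Found binary nvcc at ..." is not the one that holds the binary, which seems worth including in an issue report.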