Compiling CUDA code with BinaryBuilder

I apologize in advance because I know my question is not very specific, but at this point I do not know where else to turn for help.

I’ve been trying to compile some CUDA code using BinaryBuilder, but I haven’t been able to succeed. I really don’t think I need anything fancy to compile my code, but I do need nvcc, and this is what I have been struggling with. Here is what I have tried so far:

Trying to load CUDA as a dependency

I tried copying an existing build_tarballs.jl, such as NCCL’s build_tarballs.jl, but I can’t even get that to run. When I try to run NCCL’s script with julia build_tarballs.jl --verbose --debug (on a locally cloned version of Yggdrasil) I get an error:

ERROR: LoadError: KeyError: key v"11.4.4" not found

where 11.4.4 is the CUDA version being installed. The problem here seems to be that CUDA.required_dependencies (from platforms/cuda.jl) outputs dependencies that can’t be found. Here is a concrete example:

For the target x86_64-linux-gnu-cuda+11.4, that is, platform = Platform("x86_64", "linux"; libc="glibc", cuda="11.4"), CUDA.required_dependencies(platform) returns the following two dependencies:

  • BuildDependency(PackageSpec(name="CUDA_SDK_jll", version=v"11.4.4"))
  • BuildDependency(PackageSpec(name="CUDA_Runtime_jll"))
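For reference, here is roughly how I am querying this (a sketch; it assumes a local Yggdrasil checkout providing platforms/cuda.jl):

```julia
using BinaryBuilder, Pkg

# platforms/cuda.jl comes from the local Yggdrasil checkout
include("platforms/cuda.jl")

platform = Platform("x86_64", "linux"; libc="glibc", cuda="11.4")
deps = CUDA.required_dependencies(platform)
```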

But then when I run build_tarballs with these dependencies, I get the KeyError: key v"11.4.4" not found error. So I figured I would use the exact version string I found in the JuliaBinaryWrappers repo on GitHub. At this point, I have the following:

platforms = [Platform("x86_64", "linux"; libc="glibc", cuda="12.3")]
dependencies = [
    HostBuildDependency(PackageSpec(; name="CMake_jll", version=v"3.28.1+0")), # we need CMake >= 3.18, but by default it is 3.17.2
    BuildDependency(PackageSpec(name="CUDA_SDK_jll", version=v"12.3.2+0")),
]

But now when I get dropped into the sandbox with these dependencies, nvcc doesn’t work. I tried compiling a hello world, but I get the following errors:

/opt/x86_64-linux-gnu/bin/../lib/gcc/x86_64-linux-gnu/4.8.5/../../../../x86_64-linux-gnu/bin/ld: cannot find -lcudadevrt
/opt/x86_64-linux-gnu/bin/../lib/gcc/x86_64-linux-gnu/4.8.5/../../../../x86_64-linux-gnu/bin/ld: cannot find -lcudart_static

This led me to an issue for NCCL’s build, but nothing I found there seemed to solve my problem. Since the error mentions cudart_static, I also tried adding CUDA_SDK_static_jll as a dependency, but I still get the same error from ld when running nvcc. Adding CUDA_full_jll didn’t help either. I feel like I am very close with this setup, but I cannot get cudadevrt or cudart_static to be found. (I did, of course, try adding $prefix/cuda/lib to LD_LIBRARY_PATH, but that did not do anything either.)
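One detail that may matter here: LD_LIBRARY_PATH only affects where shared libraries are found at run time, not where ld searches at link time, so pointing nvcc at the SDK’s library directory explicitly might be the missing piece. A sketch of what I mean (the $prefix/cuda layout is my assumption about where CUDA_SDK_jll unpacks; verify with ls inside the sandbox):

```shell
# Inside the BinaryBuilder sandbox; paths are assumptions, not confirmed.
export PATH=$prefix/cuda/bin:$PATH

# Use the cross toolchain's host compiler, and point the linker at the
# SDK's static libraries explicitly, since LD_LIBRARY_PATH is only
# consulted at run time, not at link time.
nvcc -ccbin="$CXX" \
     -L"$prefix/cuda/lib64" \
     hello.cu -o hello
```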

Trying to manually install CUDA in the sandbox

Then I figured I might as well install CUDA from within the sandbox environment. I found this recipe, which I started copying, but it has install scripts for CUDA 10, which I don’t think supports the sm_80 compute architecture I need. I tried to use more up-to-date files:

sources = [
    FileSource("", "298936c727b7eefed95bb87eb8d24cfeef1f35fecac864d98e2694d37749a4ad"),
    FileSource("", "24b2afc9f770d8cf43d6fa7adc2ebfd47c4084db01bdda1ce3ce0a4d493ba65b"),
]
but I’ve honestly had a hard time trying to install CUDA this way within the sandbox. This is where I am stuck currently.
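For the record, the direction I was attempting: NVIDIA’s .run installers support an unattended mode, so in principle something like the following could install just the toolkit into the prefix (the runfile name is a placeholder for whichever FileSource ends up being used; I haven’t gotten this working in the sandbox):

```shell
# Hypothetical sketch: unattended, toolkit-only install into the prefix.
sh cuda_12.3.2_linux.run --silent --toolkit --toolkitpath=$prefix/cuda
```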

Is there a better way of doing this? Is there a certain set of dependencies that lets me have access to nvcc, or do I have to build CUDA myself in the sandbox?

Thank you!

I’m not an expert in this. As far as I can see, CUDA_Runtime provides ptxas, which is a compiler downstream of nvcc. Not sure if that helps, but as far as I can tell CUDA.jl works at that level, and hence there was no need for nvcc.

In the existing build scripts, nvcc seems to live at a specific path, which may be why it wasn’t available in the sandbox; see the line in the build script. Maybe you could suggest a PR to add nvcc as a file product.

Just as a side question: looking at the code, it seems like using CUDA.jll would be easier to maintain. (But I don’t know if there are bits that don’t work; obviously you know what you’re doing.)
