Suppose I have a package and I am interested in giving end users an option to speed up their calculations in the presence of an NVIDIA GPU. Not everyone will benefit from this option, so I can’t add CUDA as a hard dependency. Is there a solution specifically designed for the CUDA.jl stack that allows package developers to add the stack as an optional dependency?
Note: I am aware of Requires.jl but I can’t afford the lack of support for semantic versioning.
Is it a problem to include CUDA as a dependency?
If there is no GPU available, you can then decide what to do.
That is also mentioned in the CUDA.jl docs.
There is a simple solution to this, which has been adopted by many packages already:
1. add CUDA.jl as a normal dependency
2. do `using CUDA` in your package
3. check at runtime whether CUDA is functional (with `CUDA.functional()`) and use the GPU only if it is.
Installing CUDA.jl, as well as running `using CUDA` and compiling CUDA.jl code, will work without error on a system without a GPU. Only at runtime, when actually trying to use the GPU, would it fail, and you avoid that by not touching the GPU when `CUDA.functional()` returns false.
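A minimal sketch of that pattern might look as follows; `mysum` and `MyPackage` are hypothetical names for illustration, with CUDA.jl assumed to be a normal dependency:

```julia
module MyPackage

using CUDA

function mysum(x::AbstractVector)
    if CUDA.functional()
        # GPU path: move the data to the device and reduce there.
        return sum(CuArray(x))
    else
        # CPU fallback: plain Base reduction.
        return sum(x)
    end
end

end # module
```

The check is cheap after the first call, so dispatching on it at runtime keeps the package loadable and usable on machines without a GPU.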
Will any system that runs Julia successfully build CUDA.jl? How often does the build fail on different OSes, platforms, etc.? Ideally I would like to avoid an extra dependency, especially if it has some potential to reduce the number of places where a package can run.
We put a lot of effort in making sure that CUDA.jl can be loaded, even if CUDA is not available. You can use CUDA.functional() to check if your system has CUDA available.
CUDA.jl does nothing before its first use. Even artifacts are downloaded at run time. So the only ‘cost’ is additional dependencies, and package load time. The latter can still be optimized, so if somebody’s familiar with the recent techniques for improving latency (optimizing precompilation, avoiding invalidations) that would be a valuable contribution.
Yes, that is a big concern I have as well. Downloading a bunch of binaries and packages always seems too heavy, especially if you take into account developing countries with poor internet connections. Even though `CUDA.functional()` is wonderful (thanks for sharing), it still doesn’t fully solve the issue from a software engineering perspective.
Maybe someday in the future we could have a julia --gpu flag or something cooked into the language like we do for threads and distributed computing? I am just wondering if that would make sense at all. No need to install CUDA, OpenCL, … dependencies manually, the language itself would have GPU support baked in.
From heaviest to lightest, which ones could potentially introduce non-trivial issues during installation? I see a lot of compiler packages, but I don’t understand how challenging it is to install these packages on old machines (thinking of students in developing countries).
No, most of those packages are fairly small. We’ve taken care to isolate larger dependencies in separate packages (like NNlibCUDA.jl) or by using Requires.jl.
There’s a couple of packages that require binary dependencies, but those work well enough nowadays to not be a problem.
> …from the fact that one of the binary libraries may link against CUDA_jll.
SCS is a tiny (hundreds of KB) C library; we ship a separate SCS_GPU_jll which has CUDA_full_jll as a build dependency only. The aim is to add the GPU functionality to the package only conditionally, when CUDA_full is already present.
You mean CUDA_jll? CUDA_full_jll should never be present, as you mention it’s a build dep only.
That said, currently the way CUDA.jl handles its CUDA toolkit dependency is entirely separate from CUDA_jll.jl. We hope to unify that in the future, but as of now you can’t have both loaded at the same time.
We use CUDA_full_jll as a build dependency and require users to run `using CUDA_jll` before `using SCS`;
this way we can define GPU-related functionality in SCS.jl through Requires and only include it when CUDA is present. This e.g. defines some constants in the package.
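A rough sketch of that conditional-loading setup with Requires.jl might look like this; the module name, included file, and UUID are placeholders (the real UUID would come from the registry), not the actual SCS.jl source:

```julia
module MyPkg

using Requires

function __init__()
    # Only pull in the GPU glue code once CUDA_jll has been loaded
    # by the user (e.g. `using CUDA_jll` before `using MyPkg`).
    # Replace the UUID placeholder with CUDA_jll's registry UUID.
    @require CUDA_jll = "00000000-0000-0000-0000-000000000000" include("gpu.jl")
end

end # module
```

The downside, as noted earlier in the thread, is that code loaded through `@require` is not covered by semantic versioning in the Project.toml compat section.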
However, maybe it’s time to revisit the solution. What do you suggest?
That’s the correct set-up indeed, but you mentioned CUDA_full twice in your previous post, so I assume you meant CUDA_jll. This solution doesn’t integrate with CUDA.jl yet, but I hope to take another look at this in the new year.
It seemed like an interesting idea to ask for computational resources explicitly in user scripts. If the language had this notion of resource built in, we could switch between different resources without depending on packages, I guess? Just a set of abstract types defining the resources, plus a convention to implement all Base methods with the resource as the first argument when possible, falling back to the CPU.
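A rough sketch of what that convention could look like; all of these types and names are hypothetical, nothing like this exists in Base today:

```julia
# Hypothetical resource hierarchy with a CPU fallback convention.
abstract type AbstractResource end
struct CPU <: AbstractResource end
struct GPU <: AbstractResource end

# Convention: the resource is the first argument; the one-argument
# form falls back to the CPU implementation.
process(x) = process(CPU(), x)
process(::CPU, x) = sum(x)  # plain CPU implementation
process(::GPU, x) = error("load a GPU backend to enable this method")
```

A GPU package could then extend `process(::GPU, x)` with its own method, and user scripts would select the resource explicitly without the base package ever depending on CUDA.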