CuArrays/CUDAnative PSA: Simplified package loading

Hi all,

I’ve just tagged new versions of CuArrays/CUDAnative/… and among the usual slurry of features and bug fixes there’s a major change in how the packages are built and loaded. There used to be a Pkg.build step which could fail if you didn’t have a properly set-up CUDA GPU. This has made a lot of people very angry and been widely regarded as a bad move.

As an alternative, I have dropped this installation-time set-up and moved it to the precompilation phase, i.e., when you first import the package. As a result, you can now safely depend on CUDA packages since they won’t ever fail during installation. This is especially useful for clusters and containers, where you want to install packages in an environment that probably does not have a GPU.

Of course, loading the package can still fail if your user doesn’t have a CUDA GPU, which is why CUDAapi now provides a couple of functions to check for that:

using CUDAapi # this will NEVER fail
if has_cuda()
    try
        using CuArrays # we have CUDA, so this should not fail
    catch ex
        # something is wrong with the user's set-up (or there's a bug in CuArrays)
        @warn "CUDA is installed, but CuArrays.jl fails to load" exception=(ex,catch_backtrace())
    end
end

There’s also CUDAapi.has_cuda_gpu() to check if the user actually has a GPU.
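
To show how the two checks might be combined, here is a minimal sketch. The helper choose_backend is hypothetical (it is not part of CUDAapi), and it assumes that GPU code needs both a CUDA installation and a visible device. It takes plain Bools so it runs on any machine.

```julia
# Hypothetical helper, not part of CUDAapi: choose a backend from the two checks.
# Assumption: GPU code needs both a CUDA installation and a visible device.
function choose_backend(cuda_found::Bool, gpu_found::Bool)
    return (cuda_found && gpu_found) ? :gpu : :cpu
end

# With CUDAapi loaded, the flags would come from has_cuda() and has_cuda_gpu().
# Plain Bools are used here so the sketch runs without any CUDA packages.
for (cuda_found, gpu_found) in ((true, true), (true, false), (false, false))
    println("has_cuda=", cuda_found, " has_cuda_gpu=", gpu_found,
            " => ", choose_backend(cuda_found, gpu_found))
end
```

In a package, the call would presumably look like choose_backend(has_cuda(), has_cuda_gpu()) after using CUDAapi, with the :gpu branch guarding the try/using CuArrays pattern shown above.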

As a result of all this, it should be possible to safely depend on any of the CUDA packages, without your users seeing errors because of not having a CUDA GPU. This is important, because it means we can use regular package version compatibility rules and don’t have to roll our own.

Two notes based on user feedback:

  1. If you see the error message LoadError: LoadError: UndefVarError: libcudnn not defined, it probably comes from Flux, which needs to be updated for the new version of CuArrays. Pin CuArrays for the time being. If the error comes from somewhere else, please file an issue.

  2. Loading might fail with the error Could not find library 'cublas'. This library should be part of the CUDA toolkit, and we’ve become stricter about it being available (since it underpins lots of essential functionality in CuArrays). Please make sure your CUDA installation is OK and provides libcublas. If it does, re-run the failing using CuArrays statement with JULIA_DEBUG=CUDAapi set, and create an issue with details about your system and the location of libcublas.
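
The debugging steps in the second note could be gathered into one script, roughly as sketched below. This is an illustration, not an official procedure. It assumes that setting ENV["JULIA_DEBUG"] inside the session has the same effect as exporting the variable in the shell, and the library names passed to Libdl.find_library are guesses that may not match every platform (Windows builds, for instance, tend to use versioned names).

```julia
using InteractiveUtils  # provides versioninfo()
using Libdl             # provides find_library() and dlpath()

# Enable debug-level logging for the CUDAapi module only.
ENV["JULIA_DEBUG"] = "CUDAapi"

# Retry the failing import and keep the full error for the issue report.
try
    using CuArrays
catch ex
    @error "CuArrays failed to load" exception=(ex, catch_backtrace())
end

# System details worth pasting into the issue.
versioninfo()

# Ask the dynamic loader whether it can see cuBLAS on the default search path.
# The candidate names are assumptions; adjust them for your platform.
cublas_name = Libdl.find_library(["libcublas", "cublas"])
if isempty(cublas_name)
    println("libcublas was not found on the default library search path")
else
    println("libcublas found at: ", Libdl.dlpath(cublas_name))
end
```

If the loader finds libcublas here but CuArrays still fails to load, that mismatch is worth mentioning in the issue.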