AOT: Is it possible to enable AOT compilation? It takes around 10 seconds to compile the kernel, and compilation happens every time even though I rarely change the kernel.
Portability: When I run the code on a machine without CUDA, an error occurs at the first `using CUDAdrv, CUDAnative`. Currently my strategy is to keep the CUDAnative code in separate files and only include them when the machine has CUDA. Is there a more elegant strategy to make the code run on machines both with and without CUDA, other than selective inclusion of source files?
That shouldn’t be the case. Yes, the first compilation takes a while, but that is mainly due to the CUDAnative.jl and LLVM.jl packages getting compiled to native code. Future developments on the Julia compiler should improve this.
Subsequent compilation of new or modified kernels should take less than 1 second, depending on the complexity.
Yes, conditional modules are a weak point of the package infrastructure. @MikeInnes has also been dealing with this. What we generally do is have conditional code depending on whether CUDAnative is installed (by checking Pkg.installed) and whether a device is available, checked at package build time by setting a global flag. That way you avoid nasty errors from having CUDAnative listed in REQUIRE but throwing a fit during Pkg.build.
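As a sketch of that build-time flag approach (the file names and the `nvidia-smi` probe here are illustrative assumptions, not what any particular package actually does), a `deps/build.jl` could record availability like so:

```julia
# deps/build.jl -- illustrative sketch of the build-time flag approach.
# Assumption: we probe for CUDA via Pkg.installed and `nvidia-smi`;
# real packages may probe differently (e.g. via CUDAdrv).
has_cuda = try
    Pkg.installed("CUDAnative") != nothing && success(`nvidia-smi`)
catch
    false
end

# Write a flag file that the package can include at load time to
# decide whether to pull in its GPU code paths.
open(joinpath(@__DIR__, "cuda_flag.jl"), "w") do io
    println(io, "const CUDA_AVAILABLE = ", has_cuda)
end
```

The package then does `include(joinpath("..", "deps", "cuda_flag.jl"))` and branches on `CUDA_AVAILABLE`, so a missing GPU degrades gracefully instead of erroring at `using` time.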
RE portability: As far as possible, you should just deal with AbstractArrays. If you do need to special-case the GPU, e.g. with a custom kernel, you can handle that with Requires.jl:
@require CuArrays begin
    using CuArrays, CUDAnative
    # overload some functions for CuArray
end
The GPU workflow for your library will look something like:
using Foo, CuArrays
my_data = cu(my_data)
Foo.excellent_computations(my_data)
This is a pretty reasonable way to ask for GPU support, and it will just work if you do things as above – as well as being much more robust than things like checking Pkg.installed at compile time.