Lightweight dependency for GPU programming

As a package author, is it possible to depend on KernelAbstractions.jl in my package without pulling in heavy dependencies? I see that the package currently depends on:

[deps]
Adapt = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"
Atomix = "a9b6321e-bd34-4604-b9c9-b65b8de01458"
GPUCompiler = "61eb1bfa-7361-4325-ad38-22787b887f55"
InteractiveUtils = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
LLVM = "929cbde3-209d-540e-8aea-75f648917ca0"
MacroTools = "1914dd2f-81c6-5fcd-8719-6d5c9610ff09"
OpenCL_jll = "6cb37087-e8b6-5417-8430-1f242f1e46e4"
PrecompileTools = "aea7be01-6a6a-4083-8856-8a6e6704d82a"
Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
SPIRVIntrinsics = "71d1d633-e7e8-4a92-83a1-de8814b09ba8"
StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
UUIDs = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"
pocl_jll = "627d6b7a-bbe6-5189-83e7-98cc0a5aeadd"

Could we get just the KA syntax, and leave the functionality to package extensions?

The workflow I have in mind starts with me writing a kernel inside my package:

module MyPkg
  using KernelAbstractions

  @kernel function f(...)
    # do something
  end
end

and users loading the desired device to run the function:

using MyPkg
using CUDA
using ScopedValues

with(device => CUDADevice()) do
  MyPkg.f(...)
end
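
Here device would presumably be a ScopedValue exported by the lightweight package itself; something along these lines (the name and element type are just illustrative):

module MyPkg
  using ScopedValues

  # Hypothetical: back-end extensions would populate this with their device.
  const device = ScopedValue{Any}(nothing)
end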

I know this is probably not possible today, but I wonder what the best approximation would be right now.
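The closest pattern I know of is to take the back-end from the arrays themselves via get_backend, so users pick the device simply by passing a CuArray, ROCArray, etc. A minimal sketch (the kernel and function names are just examples):

module MyPkg
  using KernelAbstractions

  @kernel function square!(y, @Const(x))
    i = @index(Global)
    y[i] = x[i]^2
  end

  # The back-end is inferred from the input array, so the same code
  # runs on whatever device the user's array lives on.
  function apply_square!(y, x)
    backend = get_backend(x)
    square!(backend)(y, x; ndrange = length(x))
    KernelAbstractions.synchronize(backend)
    return y
  end
end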


Which of these dependencies are too heavy? They are all intended to be relatively lightweight packages; e.g., there are no back-ends included. IIUC the next breaking release of KA.jl might also exclude the CPU back-end, further reducing the set of dependencies (dropping the OpenCL and pocl JLLs).

This is subjective, of course, but imagine a lightweight package that does one thing and one thing only, and does it very well in native Julia. Such a package could then be ported to all sorts of exotic devices and platforms officially supported by the language.

I don’t want to increase compilation time downstream, nor do I want to take the risk of an external binary failing to build on a given platform. That means I usually don’t take JLLs as dependencies.

Ideally (and I know this is hard) we would have the core macros defined in KernelAbstractions.jl, and anything related to implementations defined in extensions like CUDAKernelAbstractionsExt.jl and OpenCLKernelAbstractionsExt.jl.

That way users who don’t have any interest in OpenCL functionality wouldn’t be affected by potential compilation issues, etc.
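In Project.toml terms I am imagining something like this (extension names are illustrative, and the weakdep UUIDs are elided):

[deps]
MacroTools = "1914dd2f-81c6-5fcd-8719-6d5c9610ff09"

[weakdeps]
CUDA = "..."
OpenCL = "..."

[extensions]
CUDAKernelAbstractionsExt = "CUDA"
OpenCLKernelAbstractionsExt = "OpenCL"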

That is literally how things work right now. The only reason OpenCL is a direct dep is that the CPU back-end used to be available by default, so moving it out is a breaking change which hasn’t happened yet.


Would it be too hard to implement the CPU back-end in native Julia? I understand it is the fallback when users simply want to run their code on the CPU as usual.

Yes. It was previously implemented like that, and it is prohibitively hard to map GPU SPMD semantics onto Julia constructs without significant compiler analyses (and with the code performing well, of course). PoCL does that for us.
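To see why, consider a toy kernel with a work-group barrier:

@kernel function shift!(a)
  i = @index(Local)
  tmp = @localmem eltype(a) (64,)
  tmp[i] = a[i]
  @synchronize  # every work-item in the group must reach this point
  a[i] = tmp[mod1(i + 1, 64)]
end

Lowering this to a plain Julia loop means splitting the loop body at the barrier into separate passes over the work-items (loop fission), and doing that in general is exactly the kind of compiler transformation PoCL already implements.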

Thank you for clarifying.

Do you mean that the next breaking release will also drop OpenCL_jll?

I think @vchuravy was considering this, yes. I’ll leave the confirmation to him.
