We are pleased to announce the first stable release of JACC.jl v1.0 (Julia for ACCelerators): https://github.com/JuliaGPU/JACC.jl. Feel free to star the repo.
What’s different?
- Single-source Julia code built on array, parallel_for, and parallel_reduce APIs for vendor-neutral computing, so no GPU programming is required
- High performance on NVIDIA, AMD, Intel, and Apple GPUs as well as CPUs
- No vendor-specific code or function annotations; the backend is selected outside the code, keeping it 100% portable
- Interactive parallel code development (see below)
- Close collaboration with the community of users and contributors, as in our v1.0 API specs discussion
- High-level APIs: JACC.jl makes a best-effort separation of computational science (your science code) from computer science (low-level performance knobs)
- Optional low-level APIs: blocks, threads, synchronization, streams, etc.
- Multi-GPU, asynchronous GPU execution, and shared memory support
- The repo follows the OpenSSF best practices badge requirements

JACC.jl is a registered Julia package and follows the standard installation process:
julia -e 'using Pkg; Pkg.add("JACC")'
Set a backend (if needed):
julia -e 'using JACC; JACC.set_backend("CUDA")'
Supported backends: CUDA, AMDGPU, Metal, oneAPI, and Threads (the default)
Portable CPU/GPU code example:
import JACC
JACC.@init_backend

# Kernel: each parallel index i updates x[i] in place
function axpy(i, alpha, x, y)
    @inbounds x[i] += alpha * y[i]
end

N = 100_000
alpha = Float32(2.0)
x = JACC.zeros(Float32, N)
y = JACC.array(fill(Float32(5), N))

JACC.parallel_for(N, axpy, alpha, x, y)
sum_x = JACC.parallel_reduce(N, (i, x) -> x[i], x)
println("Result: ", sum_x)
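As a second sketch using the same parallel_reduce entry point (the dot kernel below is our own illustration, not from the announcement), each index returns a per-element contribution and the contributions are summed:

```julia
import JACC
JACC.@init_backend  # loads whichever backend was selected; Threads is the default

# Per-index contribution; parallel_reduce combines the returned values by summation.
dot(i, x, y) = @inbounds x[i] * y[i]

N = 1_000
x = JACC.array(fill(Float32(2), N))
y = JACC.array(fill(Float32(3), N))

r = JACC.parallel_reduce(N, dot, x, y)
println("dot(x, y) = ", first(r))
```

On the default Threads backend this runs on the CPU; after selecting a GPU backend with JACC.set_backend, the identical code runs on the device.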
Our team at Oak Ridge National Laboratory is working hard to close gaps in Julia for HPC programming. We thank the Julia community for its contributions, in particular the JuliaGPU vendor-specific backends, as well as our US Department of Energy sponsors. We hope JACC.jl helps people adopt Julia for productive science using parallel computing. Feedback is welcome: we would love to hear about performance gaps, issues, use cases, etc.
Papers:
Preliminary work:
More information in the JACC.jl repo and the documentation.