[ANN] GPUEnv.jl – Dynamically created overlay GPU backend environments for testing and benchmarking

hakkelt · May 19, 2026, 2:53pm

Hi everyone!

I’m excited to announce GPUEnv.jl, a new utility package designed to make life easier for developers maintaining Julia packages that support multiple GPU backends.

The Problem

If your package supports CUDA.jl, AMDGPU.jl, Metal.jl, oneAPI.jl, etc., it can be a headache to manage your test/ or benchmark/ environments. Including all those backends permanently makes the parent environment incredibly slow to resolve and unnecessarily large. However, if you don’t include them, it is hard to automatically test or benchmark your code on whatever GPU hardware happens to be available on the host machine.

The Solution: GPUEnv.jl

GPUEnv.jl solves this by building a temporary (or persisted) overlay environment on top of your active project. It detects which hardware is actually available on the host machine, asks Pkg to resolve only those relevant backend packages, and leaves your parent project entirely unchanged.
Currently supported backends: JLArrays, CUDA, AMDGPU, Metal, oneAPI, and OpenCL.
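
To make the overlay idea concrete, here is a rough sketch of how such an environment could be built by hand with Pkg. It is not GPUEnv's actual implementation: the function name sketch_overlay and its keyword arguments are hypothetical, and it assumes the parent project uses the file names Project.toml and Manifest.toml.

```julia
using Pkg

# Hypothetical helper (not part of GPUEnv's API): copy the parent project's
# environment files into a scratch directory, activate that copy, and add
# extra packages only there. The parent files are read but never written.
function sketch_overlay(extra_pkgs::Vector{String};
                        parent::String = dirname(Base.active_project()),
                        dir::String = mktempdir())
    for file in ("Project.toml", "Manifest.toml")
        src = joinpath(parent, file)
        dst = joinpath(dir, file)
        if isfile(src)
            cp(src, dst; force = true)
            # Make sure the copy is writable even if the source was read-only.
            chmod(dst, filemode(dst) | 0o200)
        end
    end
    Pkg.activate(dir)
    # Resolve the extra packages inside the overlay only.
    isempty(extra_pkgs) || Pkg.add(extra_pkgs)
    return dir
end
```

Calling sketch_overlay(["CUDA"]) would add CUDA to the copy while the parent Project.toml stays as it was. A plain file copy like this does not handle path-based dependencies, since relative paths in a Manifest are resolved against the manifest's own location.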

Key Use Cases & Examples

1. Test Overlays

You want CPU-only CI coverage via JLArrays, but you also want to exercise real hardware (CUDA, Metal, etc.) if it’s available on the machine running the tests:

# test/runtests.jl
using GPUEnv
using Test

# Creates an overlay environment with JLArrays + any detected native GPUs
GPUEnv.activate(; include_jlarrays = true, persist = true)

for backend in gpu_backends(; include_jlarrays = true)
    x = gpu_ones(backend, Float32, 64, 64)
    y = gpu_ones(backend, Float32, 64, 64) .* 2
    @test Array(x + y) == 3f0 .* ones(64, 64)
end

2. Benchmark Overlays

For benchmarking, you typically want to skip JLArrays since CPU mocks aren’t representative of GPU performance. GPUEnv can fetch native backends only, and you can easily skip the benchmark if no native GPU is found:

# benchmark/gpu_benchmark.jl
using GPUEnv
using BenchmarkTools

GPUEnv.activate(; include_jlarrays = false, only_first = true)

backends = gpu_backends(; include_jlarrays = false)
if isempty(backends)
    println("No functional native GPU backend found; skipping benchmark run.")
else
    backend = first(backends)
    x = gpu_randn(backend, Float32, 1024)
    y = gpu_randn(backend, Float32, 1024)  # second operand used in the benchmark below
    @btime begin
        $x .+ $y
        synchronize_backend($backend)
    end
end

3. Backend Prediction and Unified Allocation

Downstream code can query the installed backends and allocate arrays through a small common interface instead of branching on CUDA versus AMDGPU versus Metal everywhere:

using GPUEnv

predicted = predict_backends()
@show predicted

for backend in gpu_backends(; include_jlarrays = true)
    x = gpu_zeros(backend, Float32, 64, 64)
    y = gpu_ones(backend, Float32, 64, 64)
    z = gpu_randn(backend, Float32, 64, 64)
    @show backend.name typeof(z)
end
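
For readers wondering what a "small common interface" can look like, the following is a minimal sketch of the general pattern, assuming a backend is described by a name plus a function that moves a host Array to the device. SketchBackend and the sketch_* functions are hypothetical names for illustration only; GPUEnv's own types and internals may differ.

```julia
# Hypothetical backend descriptor: a name and a host-to-device conversion.
struct SketchBackend{F}
    name::Symbol
    to_device::F
end

# Allocate on the host, then convert once through the backend's function.
sketch_zeros(b::SketchBackend, ::Type{T}, dims::Integer...) where {T} =
    b.to_device(zeros(T, dims...))
sketch_ones(b::SketchBackend, ::Type{T}, dims::Integer...) where {T} =
    b.to_device(ones(T, dims...))
sketch_randn(b::SketchBackend, ::Type{T}, dims::Integer...) where {T} =
    b.to_device(randn(T, dims...))

# CPU stand-in: the "device" array is just the host Array.
cpu = SketchBackend(:CPU, identity)
# With CUDA.jl loaded, a GPU variant could be SketchBackend(:CUDA, CUDA.CuArray).

x = sketch_ones(cpu, Float32, 2, 3)
```

Calling code then only ever sees the backend value, so the per-vendor branching lives in one place.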

How it works

GPUEnv prefers direct host hints for backend prediction (Linux device nodes, Windows video controller names, macOS display info) and falls back to command-line utilities such as nvidia-smi or rocminfo if needed. Once it has predicted the backends, it copies your project context into the overlay, adds the necessary GPU packages there via Pkg, and activates the overlay. You can also pass persist = true to cache the overlay environment in a local gpu_env/ folder, which saves resolution time on subsequent runs.
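
The "host hints first, command-line tools second" strategy can be illustrated with a simplified sketch. This is not GPUEnv's detection code: sketch_predict_backends is a hypothetical name, only three backends are covered, Windows is omitted, and the macOS branch uses an architecture check as a stand-in for inspecting display info.

```julia
# Hypothetical, simplified backend prediction (not GPUEnv's implementation).
function sketch_predict_backends()
    found = String[]
    if Sys.islinux()
        # Device nodes created by the NVIDIA and AMD (ROCm) kernel drivers.
        ispath("/dev/nvidiactl") && push!(found, "CUDA")
        ispath("/dev/kfd") && push!(found, "AMDGPU")
    elseif Sys.isapple() && Sys.ARCH === :aarch64
        # Assumption: treat Apple silicon as a candidate for Metal.
        push!(found, "Metal")
    end
    # Fallback: a vendor tool on PATH hints at a backend, but does not prove
    # that working hardware is present.
    for (tool, name) in (("nvidia-smi", "CUDA"), ("rocminfo", "AMDGPU"))
        if !(name in found) && Sys.which(tool) !== nothing
            push!(found, name)
        end
    end
    return found
end

@show sketch_predict_backends()
```

A prediction like this is only a guess about what to install; whether a backend is actually functional can only be checked after its package has been loaded.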

Links & Resources

GitHub: hakkelt/GPUEnv.jl (Detect available GPU backends and create overlay environments with them)
Documentation: https://hakkelt.github.io/GPUEnv.jl/

Feedback, feature requests (to a limited extent), and PRs are very welcome. If you are developing a package that dispatches across multiple GPU backends, give it a try and let me know how it fits into your workflow!
