Help using CUDA, Zygote, and random numbers

bgctw · December 4, 2024, 2:28pm

I get the error “llvmcall requires the compiler” when trying to take the gradient of a function that involves generating random numbers in CUDA. Here is a minimal example:

using GPUArraysCore: GPUArraysCore
using CUDA, Flux
using LinearAlgebra
using Zygote

function f3(v::AbstractVector{T}) where {T}
    randn(T, 4,4) * v[1:4]
end
function f3(v::GPUArraysCore.AbstractGPUVector{T}) where {T}
    CUDA.randn(T, 4,4) * v[1:4]
end
v_orig = collect(1.0:10.0)
Zygote.gradient(v -> sum(f3(v)), v_orig) # works

v = v_orig |> gpu
m = f3(v)
Zygote.gradient(v -> sum(f3(v)), v) # fails
Zygote.gradient(v -> sum(cpu(f3(v))), v) # fails

I suspect I do not yet understand the inner workings of CUDA and Zygote well enough. Could someone please explain why this fails, what I need to do instead, and point me to resources for understanding it better?
Background on the random numbers: I want to use a Monte Carlo approximation of an expectation inside the cost function of a stochastic gradient descent.

gdalle · December 4, 2024, 2:50pm

Not sure why this fails but can you maybe generate the random numbers outside of your differentiated function and pass them as arguments?
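A minimal sketch of that suggestion, shown on the CPU so it runs without a GPU. The name f3_noise is hypothetical and not part of the original post; on the GPU, r would instead come from CUDA.randn and v from gpu(v_orig).

```julia
using Zygote

# Hypothetical variant of f3 that receives its random matrix as an argument,
# so no random-number generation happens inside the differentiated code.
f3_noise(v, r) = r * v[1:4]

v = collect(1.0:10.0)
r = randn(4, 4)                  # drawn once, outside the gradient call

# Zygote treats r as a constant because it is captured by the closure.
g, = Zygote.gradient(v -> sum(f3_noise(v, r)), v)
```

Since sum(r * v[1:4]) is linear in v, the first four gradient entries are the column sums of r and the remaining entries are zero.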

bgctw · December 4, 2024, 3:41pm

Yes, thanks. Passing in another pre-allocated CuArray of random numbers works; I have tried that before.

However, it is awkward and probably not very efficient to pre-allocate a lot of random data and pass it to the cost function through the DataLoader of a machine-learning optimization. There are far fewer observations, covariates, and parameters than random numbers that I need. Maybe I need to implement a special DataLoader that generates the random numbers when asked for the next batch.
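One way such a loader might look, as a rough sketch only: a lazy generator that wraps any iterable of batches and draws fresh noise each time a batch is requested. The helper with_noise and the stand-in batches vector are assumptions for illustration; a real setup would wrap the actual DataLoader and, on the GPU, draw with CUDA.randn instead of randn.

```julia
using Random

# Hypothetical helper: lazily pairs each batch with freshly drawn noise.
# Nothing is pre-allocated; randn runs only when the next batch is requested.
with_noise(batches, dims...; rng=Random.default_rng()) =
    ((batch, randn(rng, Float32, dims...)) for batch in batches)

# Stand-in for a DataLoader: any iterable of batches works.
batches = [rand(Float32, 4, 8) for _ in 1:3]

noise_sizes = [size(r) for (x, r) in with_noise(batches, 4, 4)]
```

Because the noise is created outside the loss function, the gradient call only ever sees it as a plain array argument.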

mcabbott · December 4, 2024, 5:17pm

What’s probably happening is that Zygote is trying to differentiate the code inside CUDA.randn, which ultimately calls non-Julia code via llvmcall.

The reason that it does not try to do this with randn is that there’s a rule instructing it not to look, here.

You can define such a rule for CUDA.randn in your code, or make a PR adding it for everyone here.
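A sketch of what defining such a rule locally could look like, using the @non_differentiable macro from ChainRulesCore. The wrapper gpu_randn and the function f3_rule are hypothetical names; the wrapper is an assumption made here so that the rule is attached to a function owned by the user's code rather than to CUDA.randn itself.

```julia
using CUDA, ChainRulesCore, Zygote

# Hypothetical wrapper around CUDA.randn that this code owns.
gpu_randn(::Type{T}, dims...) where {T} = CUDA.randn(T, dims...)

# Tell ChainRules-aware AD systems (Zygote included) not to look inside.
ChainRulesCore.@non_differentiable gpu_randn(::Any...)

f3_rule(v::CuVector{T}) where {T} = gpu_randn(T, 4, 4) * v[1:4]

if CUDA.functional()             # only runs when a working GPU is present
    v = CuArray(Float32.(1:10))
    g, = Zygote.gradient(v -> sum(f3_rule(v)), v)
end
```

Applying the macro directly to CUDA.randn in user code should also work, but it would be type piracy, which is one reason a PR to the package is the cleaner long-term route.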

In general, you can also tell Zygote to ignore some bit of code by doing this (or the ChainRulesCore equivalent):

r = Zygote.@ignore CUDA.randn(T, 4,4)
r * v[1:4]
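The ChainRulesCore equivalent is the @ignore_derivatives macro. A sketch applied to the GPU method of f3 from the question follows; the name f3_ignored is hypothetical, and the gradient call is guarded because it needs a working GPU.

```julia
using CUDA, ChainRulesCore, Zygote

function f3_ignored(v::CuVector{T}) where {T}
    # The wrapped expression is evaluated normally but treated as a
    # constant by the AD system, so Zygote never traces into CUDA.randn.
    r = ChainRulesCore.@ignore_derivatives CUDA.randn(T, 4, 4)
    r * v[1:4]
end

if CUDA.functional()             # only runs when a working GPU is present
    v = CuArray(Float32.(1:10))
    g, = Zygote.gradient(v -> sum(f3_ignored(v)), v)
end
```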

bgctw · December 23, 2024, 8:59am

Thanks for this guide.

With the undocumented Zygote.@ignore approach, which carries a deprecation warning, I can continue developing. A more lasting general solution at the ChainRulesCore repo is in progress.
