Second-order gradients with Lux, Zygote, CUDA, Enzyme

using Lux, CUDA, cuDNN, Random, OneHotArrays, Zygote
using Functors, Optimisers, Printf

# LeNet-style CNN; with 28×28×1 inputs the feature maps go
# 28 → 24 → 12 → 8 → 4, so the flattened size is 4 * 4 * 16 = 256
model = Chain(
    Conv((5, 5), 1 => 6, relu),
    MeanPool((2, 2)),
    Conv((5, 5), 6 => 16, relu),
    MeanPool((2, 2)),
    FlattenLayer(3),
    Chain(
        Dense(256 => 128, relu),
        Dense(128 => 84, relu),
        Dense(84 => 2)
    )
)

dev = gpu_device(; force=true)

ps, st = Lux.setup(Random.default_rng(), model) |> dev;  # parameters & states on the GPU

x = randn(Float32, 28, 28, 1, 32) |> dev;       # input batch
δ = randn(Float32, 28, 28, 1, 32) |> dev;       # target for the input gradient ∂x
y = onehotbatch(rand((1, 2), 32), 1:2) |> dev;  # one-hot labels

const celoss = CrossEntropyLoss(; logits=true)  # classification loss on raw logits
const regloss = MSELoss()                       # penalty matching ∂x against δ

# standard cross-entropy loss of the model predictions
function loss_function(model, ps, st, x, y)
    pred, _ = model(x, ps, st)
    return celoss(pred, y)
end

# gradient-penalty-style loss: take the gradient of celoss w.r.t. the
# input x (an inner, first-order AD call) and match it against δ
function ∂xloss_function(model, ps, st, x, δ, y)
    smodel = StatefulLuxLayer{true}(model, ps, st)
    ∂x = only(Zygote.gradient(Base.Fix2(celoss, y) ∘ smodel, x))
    return regloss(∂x, δ) + loss_function(model, ps, st, x, y)
end

# differentiating ∂xloss_function w.r.t. ps is a gradient of a gradient,
# i.e. second-order AD, which Lux handles via its nested AD support
function ∂∂xloss_function(model, ps, st, x, δ, y)
    return only(Zygote.gradient(ps -> ∂xloss_function(model, ps, st, x, δ, y), ps))
end

∂∂xloss_function(model, ps, st, x, δ, y)
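Since Optimisers and Printf are already loaded, the second-order gradients can drive a training loop directly. A minimal sketch (Adam(3f-4) and the 10 iterations are illustrative choices, not part of the original snippet):

function train(ps; iters=10)
    opt_state = Optimisers.setup(Adam(3f-4), ps)
    for iter in 1:iters
        gs = ∂∂xloss_function(model, ps, st, x, δ, y)
        opt_state, ps = Optimisers.update(opt_state, ps, gs)
        @printf "Iteration %2d: loss %.8f\n" iter ∂xloss_function(model, ps, st, x, δ, y)
    end
    return ps
end

ps = train(ps)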

I have patched support for (log)softmax and MeanPool (MaxPool is a bit finicky to write the JVP for, so I will try to do that later) in feat: more nested AD rules by avik-pal · Pull Request #1151 · LuxDL/Lux.jl · GitHub. I will merge and tag it later tonight once tests pass.
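For context on why these rules are needed: the nested gradient above is executed forward-over-reverse, so every operation in the model needs a JVP (pushforward) rule. MeanPool is linear in its input, so its JVP is simply the pooling applied to the tangent. A conceptual sketch using NNlib directly (illustrative only, not the actual PR code):

using NNlib

xp = randn(Float32, 28, 28, 1, 32)  # primal input
ẋ  = randn(Float32, 28, 28, 1, 32)  # tangent
pdims = PoolDims(xp, (2, 2))

meanpool(xp, pdims)      # primal output
ẏ = meanpool(ẋ, pdims)   # JVP: linearity means just pooling the tangent

MaxPool, by contrast, has to route the tangent through the argmax locations of the primal, which is the finicky part.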

Also note that in the original example cuDNN (or LuxCUDA) wasn't loaded, so the conv and pooling calls couldn't dispatch to the correct (cuDNN-backed) algorithm implementations.
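To make the cuDNN kernels available, load one of the following before running the script (either option should work here):

using CUDA, cuDNN   # explicit, as in the script above
# or
using LuxCUDA       # meta-package that loads CUDA.jl and cuDNN together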
