Enzyme/Reactant: second-order AD (Laplacian) not working with Lux on GPU

Hi all,

I’m implementing VMC (Variational Monte Carlo) for quantum many-body systems in the continuum, where I need to compute the Laplacian ∇²logψ/ψ of a neural network
wavefunction. I’m trying to use Lux + Reactant + Enzyme on GPU.

What works:

  • Flux + Zygote: first-order gradients ∇_θ E on GPU :white_check_mark:
  • Lux + Reactant + Enzyme: first-order gradients ∇_θ E on GPU :white_check_mark:
  • Lux + Enzyme CPU: first-order gradients on CPU :white_check_mark:

What doesn’t work:
Second-order AD for the Laplacian via Enzyme.hvp, either with Lux or plain Julia.

MWE:

using Lux, Enzyme, Random

model = Lux.Chain(
Lux.Dense(30 => 64, tanh),
Lux.Dense(64 => 64, tanh),
Lux.Dense(64 => 1)
)

rng = Random.default_rng()
ps, st = Lux.setup(rng, model)
x = randn(Float32, 30)

logpsi(x) = sum(Lux.apply(model, reshape(x, :, 1), ps, st)[1])

First order: works fine

g = Enzyme.gradient(Enzyme.Reverse, Const(logpsi), x)

Second order via hvp: LLVM error

function laplacian(x)
N = length(x)
lap = 0f0
for i in 1:N
ei = zeros(Float32, N); ei[i] = 1f0
hv = Enzyme.hvp(logpsi, x, ei)
lap += hv[i]
end
return lap
end

laplacian(x) # fails with LLVM error

Error: LLVM error: broken gc calling conv fix / No augmented forward pass
found for LuxLib matmul operations.

Question: Is second-order AD through Lux layers supported with Enzyme?
Is there a recommended way to compute the exact Laplacian without finite differences?

Workaround: Currently using finite differences, which is actually faster than
Zygote double AD, but exact AD would be preferable for larger systems.

Julia 1.12.5, Enzyme v0.13.136, Lux (latest), Reactant (latest), CUDA 13.2

am I doing anything wrong, or not applying things correctly? Or it is just what it is right now?

We aware, though, that these examples were generated with IA, as I’m not yet sure whether it is ok to start learning all these frameworks yet (if they do not work well together today, I may stay with the old trusty Flux+Zygote which does work well -but not that fast)…

what is your reactant code, this is a pure enzyme version? That should be fastest regardless

I would like to go step by step, as I want to check first if it works, rather than learning all from the docs to see that it does not yet work…
But in any case, it is not faster, it simply freezes and hangs in an eternal loop (ie, in jupyter spits nothing while the cell is marked as running ["]).
I am running the latest julia version 1.12.something and run update almost every session, so everything should be at the latests release…
Thanks.

so counter-intuitively there are plenty of cases where reactant+enzyme will work where no reactant enzyme won’t

oh weird, that feels like something is going wrong if it’s hanging, what’s the exact code (and maybe file an issue)?

…I just posted two code examples in my first post. I wonder if they work on somebody else’a computer (runbing latest julia etc)…