Wow, seems like Optimization.LBFGS is seriously broken…
Yes, it’s not registered yet. That’s part of the “but we need to finish and document it” caveat. We already use it in some projects, for example in the GPU kernel solvers project, but it just needs a few finishing touches.
It is getting pulled out to be an extension, like all other solvers. Most of the work is already done (New Subpackage for LBFGS by ParamThakkar123 · Pull Request #986 · SciML/Optimization.jl), but merging that is breaking, so the whole big Optimization break (i.e. move the right things to OptimizationBase.jl with no solvers, change the preferred BFGS to be a proper native one with the right bells and whistles, remove some legacy stuff, etc.) will need to come all at once.
Note that it’s just the classic Fortran LBFGS-B. The fact that it’s a Fortran code is why it also doesn’t support Reactant tracing.
[Now if you’re thinking, SciML never makes a wrapped code a default solver and never misses type checking like that… yeah, this is why I have been saying Optimization.jl is in desperate need of clean up. It needs to be moved out of being a privileged auto-installed solver (which isn’t something we do anywhere else; the basic library is always solver-independent), it should be named OptimizationLBFGSB so it mirrors the Fortran code’s name, we should make sure there is a proper core solver with full generic type support and benchmark it to death, etc. So yes, this breaks a lot of standard SciML idioms; it’s known, and pointing fingers won’t fix it, I just need to find the time this fall to make Optimization.jl more like the other packages… but until then, yes, this is a little quirk I’ll need to root out. This is the next thing on my mind for after the dependency reduction / precompile improvements to OrdinaryDiffEq.jl / DifferentialEquations.jl… so more on that soon.]
As far as I can see, SimpleOptimization.LBFGS will also actually not work right now if I presupply Reactant-compiled gradients to OptimizationFunction (since instantiate_gradient is only defined for AutoForwardDiff and AutoEnzyme, and not for NoAD), right? Is there a nice way around this or should I just be patient on this front? I’m relatively happy with ReverseDiff + Optim.LBFGS right now, but Reactant compilation would be really helpful, since right now I have GBs of memory usage just from Lux model applications inside my loss function.
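For concreteness, here is a minimal sketch of what I mean by presupplying a gradient, using the documented OptimizationFunction keyword interface. The loss and compiled_grad! below are placeholders standing in for my real loss and the Reactant-compiled gradient closure; whether the NoAD path accepts this is exactly the question above.

using Optimization, OptimizationOptimJL, Optim, SciMLBase

# Placeholder loss and a stand-in for a Reactant-compiled in-place gradient
loss(u, p) = sum(abs2, u)
compiled_grad!(G, u, p) = (G .= 2 .* u)

# What I'd like: opt out of AD entirely and hand over my own gradient
optf = OptimizationFunction(loss, SciMLBase.NoAD(); grad = compiled_grad!)
prob = OptimizationProblem(optf, rand(3))

# With Optim.LBFGS this should go through the user-supplied grad; the native
# LBFGS path is where the missing NoAD method for instantiate_gradient bites.
sol = solve(prob, Optim.LBFGS())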
Also @avikpal, have you had a chance to look at the pure Enzyme situation here? It still seems weird to me that there’s no way to differentiate Lux model applications with respect to the parameters with Enzyme right now without hitting runtime activity or having to use Reactant.
Really weird discovery I just made: the runtime activity with Enzyme + Lux disappears entirely if the point at which the model is evaluated is of type Vector{Int64} (I also quickly tested Vector{Int32}; that also works). If the point is a Vector{<:AbstractFloat}, I get the runtime activity error. I assume there’s some dispatch weirdness going on somewhere.
Quick reproduction:
using Lux, Random, Enzyme

model = Dense(2 => 1)
ps, st = Lux.setup(Random.default_rng(), model)

# Int64 input: runs with no problem
Enzyme.gradient(Reverse, only ∘ Lux.LuxCore.stateless_apply, Const(model), Const([0, 0]), ps)

# Float32 input: fails because of runtime activity
Enzyme.gradient(Reverse, only ∘ Lux.LuxCore.stateless_apply, Const(model), Const([0f0, 0f0]), ps)
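If it helps, a possible workaround sketch (assuming a recent Enzyme.jl where set_runtime_activity is available): explicitly opting in to runtime activity should let the Float32 call go through, at some performance cost, so it’s a stopgap rather than a fix for the underlying dispatch weirdness.

# Workaround sketch: enable runtime activity instead of erroring on it
# (assumes Enzyme.set_runtime_activity exists in the installed Enzyme.jl version)
Enzyme.gradient(Enzyme.set_runtime_activity(Reverse),
                only ∘ Lux.LuxCore.stateless_apply,
                Const(model), Const([0f0, 0f0]), ps)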