I am working on implementing the MoMo method for adaptive learning rates in momentum optimizers, and it requires an evaluation of the objective function within the update rule.
Until now I have been implementing my own optimization rules by creating a struct for the specific optimizer I’m working on, then defining a method of “Optimisers.apply!” together with a method of “Optimisers.init” for my custom optimizer, as described in the “Optimisers.jl” docs. “Optimisers.apply!” takes the optimizer, state, parameters, and gradient as arguments, and “Optimisers.init” takes the optimizer and the parameters. Until now this worked perfectly, because that was all I needed.
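For concreteness, the pattern I have been following looks roughly like this (the rule name `SignDescent` and its update formula are made up for illustration; only the `apply!`/`init` interface is what matters):

```julia
using Optimisers

# An illustrative custom rule: step in the direction of sign(gradient),
# with a step size that decays over iterations.
struct SignDescent <: Optimisers.AbstractRule
    eta::Float64
end

# The state for this rule is just a step counter.
Optimisers.init(o::SignDescent, x::AbstractArray) = 1

# apply! receives (rule, state, parameters, gradient) and returns
# (new_state, change); the caller subtracts `change` from the parameters.
function Optimisers.apply!(o::SignDescent, state, x, dx)
    step = state
    change = (o.eta / step) .* sign.(dx)
    return step + 1, change
end
```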
But the MoMo algorithm does require the evaluation of the objective function to complete the update. Is it possible to do that within the “Optimisers.apply!” and “Optimisers.init” functions, and, if not, is there another way to implement this algorithm using Flux?
I don’t know about the MoMo algorithm, but I can confirm that by default the update step sees the state and the gradient, but not the loss.
One quick way to pass the loss would be to have the rule contain a Ref(0f0), which you write into before calling update!. I haven’t tried, but something like this:
using Flux
import Optimisers

struct DecayDescent <: Optimisers.AbstractRule
    eta::Float64
    loss::Base.RefValue{Float64}
end

rule = DecayDescent(0.1, Ref(NaN))
opt_state = Flux.setup(rule, model)  # every Leaf should see the same Ref

obj, grads = Flux.withgradient(m -> lossfun(m(x), y), model)
rule.loss[] = obj  # should change what's seen by each `apply!` in here:
Flux.update!(opt_state, model, grads[1])
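For completeness, here is a self-contained sketch of what the matching `apply!` could look like. The rule name `LossAwareDescent` and the way the loss enters the update are invented purely to show the `Ref` being read inside `apply!`; MoMo’s actual update rule would use the value differently:

```julia
using Optimisers

struct LossAwareDescent <: Optimisers.AbstractRule
    eta::Float64
    loss::Base.RefValue{Float64}
end

# This rule keeps no per-parameter state.
Optimisers.init(o::LossAwareDescent, x::AbstractArray) = nothing

function Optimisers.apply!(o::LossAwareDescent, state, x, dx)
    f = o.loss[]                # objective value written in before update!
    change = (o.eta * f) .* dx  # illustrative use of the loss, not MoMo
    return state, change
end
```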
No, the Optimisers.apply! function does not itself evaluate the objective function; you must evaluate the objective and compute its gradients before calling Optimisers.apply!. The purpose of apply! is to update the model’s parameters from the gradients under the chosen optimization rule, not to calculate the objective’s value or gradient. The usual pattern is to call Flux.withgradient, which returns the objective value and its gradients together, and then pass the gradients on to the update step.