Deprecated - Q: Optim.jl | Passing fixed parameters to an analytic derivative [gradient] for univariate [multivariate] optimization?

This question is deprecated and is replaced by Q: Optimization.jl | Passing fixed parameters to an analytic derivative [gradient] for univariate [multivariate] optimization?

Hello,

This question may be considered a continuation of the previous question “How to make Optim.jl’s optimize work for scalars?”.

Keeping in mind the caveat that “I am not sure you are aware of the possible pitfalls. Curiously, multivariate methods can break down in surprising ways in 1D, and can easily yield suboptimal performance. Optim also has GoldenSection()” [a reference and/or examples for this issue would be appreciated], I would like to perform derivative [gradient]-based univariate [multivariate] optimization using a function and its analytic derivative [gradient], both of which require passing fixed parameters.
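
For comparison, the derivative-free univariate route mentioned in the caveat can be sketched as follows. This is only a sketch: it assumes Optim's bounded univariate form `optimize(f, lower, upper, method)`, and the bracket `[0.0, 5.0]` is an assumed interval chosen for illustration. The fixed parameters are captured by a closure, and `x` stays a scalar.

```julia
using Optim

# Same objective as in the MWE: a scalar x and a vector of fixed parameters p.
f(x, p) = (x - p[1])^2

p = [2.0]
lower, upper = 0.0, 5.0   # assumed bracket containing the minimum

# The closure x -> f(x, p) captures p; no vector wrapping is needed here.
res = Optim.optimize(x -> f(x, p), lower, upper, GoldenSection())
x_min = Optim.minimizer(res)
```

This route needs a bracketing interval instead of a starting point and makes no use of the derivative.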

With a modification of Tamas_Papp’s MWE, I can currently run optimize() and pass fixed parameters to the function, using autodiff to compute the derivative, as in the following MWE:

```julia
# test_scalar_Optim.jl

using Optim

function f(x, p)
    f_x = (x - p[1])^2
    return f_x
end

# How does one pass fixed parameters, p, to an analytic derivative [gradient]?
function g!(G, x, p)
    G[1] = 2.0*(x - p[1])
    return G[1]
end

function univariate_optimize(f, x0, p, args...; kwargs...)
    opt = Optim.optimize(x -> f(x[1], p), [x0], args...; kwargs...)
    @assert Optim.converged(opt)
    Optim.minimizer(opt)[1]
end

begin
    x0 = 3.0
    p = [2.0]
    univariate_optimize(f, x0, p, BFGS(); autodiff = :forward)
end
```

but, after reading the Optim.jl documentation here and here, I have not been able to figure out how to pass fixed parameters to the analytic derivative in the univariate case, g! in the above MWE, or to the gradient in the multivariate case.
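
One possible way to do this, sketched under the assumption that Optim's `optimize(f, g!, x0, method)` form is used: wrap the gradient in a closure that captures `p`, exactly as is already done for `f`. The helper name `univariate_optimize_with_gradient` and the closure names `fp` and `gp!` are hypothetical and only serve the illustration.

```julia
using Optim

# Objective with fixed parameters p (same form as the MWE).
f(x, p) = (x - p[1])^2

# Analytic derivative in Optim's in-place style: write the result into G.
function g!(G, x, p)
    G[1] = 2.0 * (x - p[1])
    return G
end

# Hypothetical helper: closes over p for both the objective and the gradient.
function univariate_optimize_with_gradient(f, g!, x0, p, method = BFGS())
    fp = z -> f(z[1], p)              # Optim passes a 1-element vector z
    gp! = (G, z) -> g!(G, z[1], p)    # same closure trick for the gradient
    opt = Optim.optimize(fp, gp!, [x0], method)
    @assert Optim.converged(opt)
    return Optim.minimizer(opt)[1]
end

x_min = univariate_optimize_with_gradient(f, g!, 3.0, [2.0])
```

The same pattern should carry over to the multivariate case by passing the full vector `z` to `f` and `g!` instead of `z[1]`, and filling every entry of `G`.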

Rationale. As I have the analytic derivative available, I would like to use it for speed and efficiency. The calculation runs over a series of time steps, and the function minimization is performed at each time step. The minimum will shift by some small amount at each time step, so a good estimate for the starting point of the current minimization is the result of the previous minimization.
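
The warm-start scheme described above could look roughly like the following sketch. It assumes the fixed parameter changes slightly at each time step. The function name `track_minimum` and the parameter sequence `2.0:0.01:2.05` are hypothetical and chosen only for illustration.

```julia
using Optim

f(x, p) = (x - p[1])^2

function g!(G, x, p)
    G[1] = 2.0 * (x - p[1])
    return G
end

# Hypothetical helper: minimizes f at each time step, starting from the
# minimizer found at the previous step.
function track_minimum(p_values, x0)
    x = [x0]                  # starting point, updated after every step
    minimizers = Float64[]
    for pk in p_values
        p = [pk]              # fixed parameters for this time step
        opt = Optim.optimize(z -> f(z[1], p), (G, z) -> g!(G, z[1], p), x, BFGS())
        x = copy(Optim.minimizer(opt))   # warm start for the next step
        push!(minimizers, x[1])
    end
    return minimizers
end

minimizers = track_minimum(2.0:0.01:2.05, 3.0)
```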