Optim.jl - computing value, gradient and hessian simultaneously


I’ve seen in the documentation of Optim.jl that there is a basic trick to avoid recomputing the same quantity when evaluating a function and its gradient (and potentially also its hessian). The idea is to store whatever is reused in a “buffer array” and use a trick to only update this buffer when needed. However I believe that there are cases where computing value and gradient together is faster than separately, but storing a “buffer” is not the ideal solution. For example, imagine that one has to go over a long for loop where every iteration is more or less complex depending on whether you also need to compute gradient and hessian: if you have to compute value, gradient and hessian together, you don’t want to go over the loop three times.
I was wondering, is there a simple way to tell Optim:

  • if you only need the value, do f
  • if you need value and gradient do g
  • if you need value, gradient and hessian do h


It is possible to create a DifferentiableFunction like:

df = Optim.DifferentiableFunction(f, g!, fg!) 

where fg!(x, g) returns the function value and stores the gradient in g. This can be used in optimization routines.
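To illustrate the single-pass idea, here is a minimal sketch of such an fg! for a loop-based objective. The data, the least-squares model, and the helper name make_objective are hypothetical and only serve the illustration; the fg!(x, g) argument order follows the description above. The residual is computed once per iteration and reused for both the value and the gradient.

```julia
# Hypothetical helper: builds f and fg! for the least-squares objective
# sum_i (x[1] * ts[i] + x[2] - ys[i])^2. Not part of Optim.
function make_objective(ts, ys)
    # Value only: one pass over the data.
    function f(x)
        s = 0.0
        for i in eachindex(ts, ys)
            r = x[1] * ts[i] + x[2] - ys[i]
            s += r^2
        end
        return s
    end

    # Value and gradient in a single pass: the residual r is computed
    # once per iteration and reused for both quantities.
    function fg!(x, g)
        fill!(g, 0.0)
        s = 0.0
        for i in eachindex(ts, ys)
            r = x[1] * ts[i] + x[2] - ys[i]
            s += r^2
            g[1] += 2 * r * ts[i]
            g[2] += 2 * r
        end
        return s
    end

    return f, fg!
end

# Made-up data lying exactly on y = 2t + 1.
obj_f, obj_fg! = make_objective([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])

grad = zeros(2)
val = obj_fg!([0.0, 0.0], grad)   # fills grad and returns the value
```

A gradient-only g! can then be written as a thin wrapper that calls fg! and discards the returned value.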


Thanks a lot! At least in my use case this seems better than the trick proposed in the documentation.
To make sure I understand: in case of methods using second-order derivatives, one can do the same with:

df = Optim.TwiceDifferentiableFunction(f, g!, fg!, h!)

And then:

  • if the algorithm needs f, it will do f
  • if it needs only the gradient it will do g!
  • if it needs f and the gradient it will do fg!
  • if it needs f, gradient and hessian, it will do both fg! and h! (the idea being, I guess, that computing the hessian is reasonably rare and probably much more costly than anything else, so there’s not much point in trying to optimize here)


You can see how it is used in the code: https://github.com/JuliaOpt/Optim.jl/blob/addda9ae36044d691b83a12c421b57675b234eeb/src/newton.jl#L30


Didn’t notice this question in December, but check out this part of the NLSolversBase (where the NDifferentiable types live) Readme


Thanks a lot! Will go through it.


I haven’t advertised it too broadly, so please report any bugs you may find!


I am using Optim.optimize(Optim.only_fg!(fg!), x0, ...) to compute the objective and gradient simultaneously. See https://github.com/JuliaNLSolvers/Optim.jl/issues/637.

How do I add a function for the Hessian? @pkofod


Use Optim.only_fgh!(fgh!), where fgh! is just like fg!:

function fgh!(F, G, H, x)
    # ... this part you know
end
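As a rough sketch of what the body of such an fgh! could look like when the work is shared inside one loop: the one-parameter exponential model, the data, and the helper name make_fgh below are hypothetical. Only the (F, G, H, x) argument order and the `nothing` checks follow the convention used in this thread.

```julia
# Hypothetical helper: builds fgh! for the objective
# sum_i (exp(x[1] * ts[i]) - ys[i])^2. Not part of Optim or NLSolversBase.
function make_fgh(ts, ys)
    function fgh!(F, G, H, x)
        # Reset only the outputs that were actually requested.
        G === nothing || fill!(G, 0.0)
        H === nothing || fill!(H, 0.0)
        s = 0.0
        for i in eachindex(ts, ys)
            e = exp(x[1] * ts[i])   # shared by value, gradient and Hessian
            r = e - ys[i]           # residual
            s += r^2
            G === nothing || (G[1] += 2 * r * e * ts[i])
            H === nothing || (H[1, 1] += 2 * ts[i]^2 * e * (e + r))
        end
        # Return the value only when it was requested.
        return F === nothing ? nothing : s
    end
    return fgh!
end

my_fgh! = make_fgh([0.0, 1.0, 2.0], [1.0, 1.0, 1.0])   # made-up data

G = zeros(1)
H = zeros(1, 1)
val = my_fgh!(0.0, G, H, [0.0])              # value, gradient and Hessian in one pass
only_grad = my_fgh!(nothing, G, nothing, [0.0])   # gradient only, returns nothing
```

The loop runs once regardless of which outputs are requested, and the more expensive gradient and Hessian terms are skipped when their buffers are `nothing`.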


Thanks. I would have found this myself but for some reason autocomplete in the REPL doesn’t help. Optim.only_ + [TAB] doesn’t print any suggestions, even though these functions are there.


Happy to help. We need to work on documentation, so thanks for asking. I thought this was mentioned somewhere.


Strange. Maybe a bug due to the “!” in them? Worth a bug report if you can reproduce this consistently.


It’s not the !. Also note that there is a version only_fg without the bang (I’m not sure what it does). And I get autocomplete for Optim.precondprep! (for example), which has a bang. I think only_fg, only_fg!, only_fgh! come from a different package that Optim depends on? Maybe this affects autocomplete.


Yes, I believe it’s because it comes from NLSolversBase.


I got around to actually trying this and I am getting an error. This is on Julia v1.0, Optim v0.17.0.

f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
function g!(G, x)
  G[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
  G[2] = 200.0 * (x[2] - x[1]^2)
end
function h!(H, x)
  H[1, 1] = 2.0 - 400.0 * x[2] + 1200.0 * x[1]^2
  H[1, 2] = -400.0 * x[1]
  H[2, 1] = -400.0 * x[1]
  H[2, 2] = 200.0
end
function fg!(F,G,x)
  G === nothing || g!(G,x)
  F === nothing || return f(x)
end
function fgh!(F,G,H,x)
  G === nothing || g!(G,x)
  H === nothing || h!(H,x)
  F === nothing || return f(x)
end

import Optim
Optim.optimize(Optim.only_fg!(fg!), [0., 0.], Optim.LBFGS()) # works fine
Optim.optimize(Optim.only_fgh!(fgh!), [0., 0.], Optim.Newton()) 
# ERROR: MethodError: objects of type NLSolversBase.InplaceObjective{Nothing,Nothing,typeof(fgh!)} are not callable

Here is the stack trace:

[1] finite_difference_gradient!(::Array{Float64,1}, ::NLSolversBase.InplaceObjective{Nothing,Nothing,typeof(fgh!)}, ::Array{Float64,1}, ::DiffEqDiffTools.GradientCache{Nothing,Nothing,Nothing,Val{:central},Float64,Val{true}}) at /opt/julia-depot/packages/DiffEqDiffTools/jv7Il/src/gradients.jl:282
[2] (::getfield(NLSolversBase, Symbol("#g!#42")){NLSolversBase.InplaceObjective{Nothing,Nothing,typeof(fgh!)},DiffEqDiffTools.GradientCache{Nothing,Nothing,Nothing,Val{:central},Float64,Val{true}}})(::Array{Float64,1}, ::Array{Float64,1}) at /opt/julia-depot/packages/NLSolversBase/Cvvki/src/objective_types/twicedifferentiable.jl:103
[3] (::getfield(NLSolversBase, Symbol("#fg!#43")){NLSolversBase.InplaceObjective{Nothing,Nothing,typeof(fgh!)}})(::Array{Float64,1}, ::Array{Float64,1}) at /opt/julia-depot/packages/NLSolversBase/Cvvki/src/objective_types/twicedifferentiable.jl:107
[4] value_gradient!!(::NLSolversBase.TwiceDifferentiable{Float64,Array{Float64,1},Array{Float64,2},Array{Float64,1}}, ::Array{Float64,1}) at /opt/julia-depot/packages/NLSolversBase/Cvvki/src/interface.jl:88
[5] initial_state(::Optim.Newton{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}}}, ::Optim.Options{Float64,Nothing}, ::NLSolversBase.TwiceDifferentiable{Float64,Array{Float64,1},Array{Float64,2},Array{Float64,1}}, ::Array{Float64,1}) at /opt/julia-depot/packages/Optim/fabGe/src/multivariate/solvers/second_order/newton.jl:45
[6] #optimize#87 at /opt/julia-depot/packages/Optim/fabGe/src/multivariate/optimize/optimize.jl:33 [inlined]
[7] optimize(::NLSolversBase.InplaceObjective{Nothing,Nothing,typeof(fgh!)}, ::Array{Float64,1}, ::Optim.Newton{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}}}, ::Optim.Options{Float64,Nothing}) at /opt/julia-depot/packages/Optim/fabGe/src/multivariate/optimize/interface.jl:113 (repeats 2 times)
[8] top-level scope at none:0

A similar error is produced even if the Hessian is not required, as in Optim.optimize(Optim.only_fgh!(fgh!), [0., 0.], Optim.LBFGS()). So the problem seems to be inside only_fgh!?

On Julia v0.6.4 with Optim v0.15.3 I get similar errors with only_fgh! (only_fg! works fine).
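One way to rule out the user-supplied function as the cause is to call fgh! directly, without going through Optim at all. The sketch below repeats the Rosenbrock definitions so that it is self-contained and evaluates everything at the starting point [0., 0.]. It only checks the function itself and says nothing about what only_fgh! does internally.

```julia
# Rosenbrock value, gradient and Hessian, as in the post above.
f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
function g!(G, x)
    G[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    G[2] = 200.0 * (x[2] - x[1]^2)
end
function h!(H, x)
    H[1, 1] = 2.0 - 400.0 * x[2] + 1200.0 * x[1]^2
    H[1, 2] = -400.0 * x[1]
    H[2, 1] = -400.0 * x[1]
    H[2, 2] = 200.0
end
function fgh!(F, G, H, x)
    G === nothing || g!(G, x)
    H === nothing || h!(H, x)
    F === nothing || return f(x)
end

x0 = [0.0, 0.0]
G = zeros(2)
H = zeros(2, 2)

# Request value, gradient and Hessian in one call.
val = fgh!(0.0, G, H, x0)
# val is 1.0, G is [-2.0, 0.0], and H has 2.0 and 200.0 on the diagonal.

# Request the gradient only; F and H are passed as `nothing`.
G2 = zeros(2)
fgh!(nothing, G2, nothing, x0)
```

If these direct calls behave as expected, the error is more likely to come from how the wrapped objective is handled than from fgh! itself.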


Huh, thanks. This is a bug in Optim.


I raised an issue


Thank you! Will fix it one of these days. What surprises me is that I’m actually using this functionality in a private repo, but maybe I’m using an optimize signature that works.