Optim.jl - computing value, gradient and hessian simultaneously


#1

I’ve seen in the documentation of Optim.jl that there is a basic trick to avoid recomputing the same quantity when evaluating a function and its gradient (and potentially also its hessian). The idea is to store whatever is reused in a “buffer array” and only update this buffer when needed. However, I believe there are cases where computing value and gradient together is faster than computing them separately, but where storing a “buffer” is not the ideal solution. For example, imagine that one has to go over a long for loop in which every iteration is more or less complex depending on whether the gradient and hessian are also needed: if you have to compute value, gradient and hessian together, you don’t want to go over the loop three times.
I was wondering, is there a simple way to tell Optim:

  • if you only need the value, do f
  • if you need value and gradient do g
  • if you need value, gradient and hessian do h

#2

It is possible to create a DifferentiableFunction like:

df = Optim.DifferentiableFunction(f, g!, fg!) 

where fg!(x, g) returns the function value and stores the gradient in g. This can be used in optimization routines.
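A minimal sketch of what such an fg! could look like, assuming the (x, g) argument order described above and a hypothetical objective f(x) = sum(exp(x[i]) - x[i]) chosen only for illustration. The argument order has differed between Optim versions, so check the one you have installed. The point is that the loop runs once and the shared exp(x[i]) is computed once per element:

```julia
# Hypothetical objective (not from Optim): f(x) = sum(exp(x[i]) - x[i]).
# Its gradient reuses the same exp(x[i]) that the value needs.
function fg!(x, g)
    val = 0.0
    for i in eachindex(x)
        e = exp(x[i])      # shared work, computed once per element
        val += e - x[i]    # contribution to the function value
        g[i] = e - 1.0     # gradient entry, stored in place
    end
    return val
end

# Direct call, without Optim, to show the contract:
x = [0.0, 1.0]
g = similar(x)
val = fg!(x, g)            # returns the value and fills g
```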


#3

Thanks a lot! At least in my use case this seems better than the trick proposed in the documentation.
To make sure I understand: in case of methods using second-order derivatives, one can do the same with:

df = Optim.TwiceDifferentiableFunction(f, g!, fg!, h!)

And then:

  • if the algorithm needs f, it will do f
  • if it needs only the gradient it will do g!
  • if it needs f and the gradient it will do fg!
  • if it needs f, gradient and hessian, it will do both fg! and h! (the idea being, I guess, that computing the hessian is reasonably rare and probably much more costly than anything else, so there’s not much point in trying to optimize here)

#4

You can see how it is used in the code: https://github.com/JuliaOpt/Optim.jl/blob/addda9ae36044d691b83a12c421b57675b234eeb/src/newton.jl#L30


#5

Didn’t notice this question in December, but check out this part of the NLSolversBase README (NLSolversBase is where the NDifferentiable types live).


#6

Thanks a lot! Will go through it.


#7

I haven’t advertised it too broadly, so please report any bugs you may find!


#8

I am using Optim.optimize(Optim.only_fg!(fg!), x0, ...) to compute the objective and gradient simultaneously. See https://github.com/JuliaNLSolvers/Optim.jl/issues/637.

How do I add a function for the Hessian? @pkofod


#9

only_fgh!(fgh!)

where fgh! is just like fg!

function fgh!(F, G, H, x)
    ... this part you know
end
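As a hedged sketch of how the elided body might be filled in for the single-loop case from the original question: the objective below, f(x) = sum(exp(x[i]) - x[i]), is hypothetical and chosen only because its value, gradient and Hessian all share exp(x[i]). It assumes the convention that an output passed as nothing is not requested, and that the value is returned when F is not nothing:

```julia
# Hypothetical objective (not from Optim): f(x) = sum(exp(x[i]) - x[i]).
# The Hessian of this objective is diagonal with entries exp(x[i]).
function fgh!(F, G, H, x)
    val = 0.0
    H === nothing || fill!(H, 0.0)            # clear off-diagonal entries
    for i in eachindex(x)
        e = exp(x[i])                         # shared by value, gradient, Hessian
        F === nothing || (val += e - x[i])    # only if the value is requested
        G === nothing || (G[i] = e - 1.0)     # only if the gradient is requested
        H === nothing || (H[i, i] = e)        # only if the Hessian is requested
    end
    return F === nothing ? nothing : val
end

# Direct calls, without Optim, to show the contract:
x = [0.0, 1.0]
G = zeros(2)
H = zeros(2, 2)
val = fgh!(0.0, G, H, x)                      # one pass fills everything
only_val = fgh!(0.0, nothing, nothing, x)     # value only, no G or H work
no_val = fgh!(nothing, G, nothing, x)         # gradient only, returns nothing
```

With Optim loaded, a function of this shape is what would be wrapped as only_fgh!(fgh!).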

#10

Thanks. I would have found this myself but for some reason autocomplete in the REPL doesn’t help. Optim.only_ + [TAB] doesn’t print any suggestions, even though these functions are there.


#11

Happy to help. We need to work on documentation, so thanks for asking. I thought this was mentioned somewhere.


#12

Strange. Maybe a bug due to the “!” in them? Worth a bug report if you can reproduce this consistently.


#13

It’s not the !. Also note that there is a version only_fg without the bang (I’m not sure what it does). And I get autocomplete for Optim.precondprep! (for example), which has a bang. I think only_fg, only_fg!, only_fgh! come from a different package that Optim depends on? Maybe this affects autocomplete.


#14

Yes, I believe it’s because it comes from NLSolversBase.


#15

I got around to actually trying this and I am getting an error. This is on Julia v1.0, Optim v0.17.0.

f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
function g!(G, x)
  G[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
  G[2] = 200.0 * (x[2] - x[1]^2)
end
function h!(H, x)
  H[1, 1] = 2.0 - 400.0 * x[2] + 1200.0 * x[1]^2
  H[1, 2] = -400.0 * x[1]
  H[2, 1] = -400.0 * x[1]
  H[2, 2] = 200.0
end
# Combined evaluators: an argument passed as `nothing` is not requested.
function fg!(F, G, x)
  G === nothing || g!(G, x)
  F === nothing || return f(x)
  nothing
end
function fgh!(F, G, H, x)
  G === nothing || g!(G, x)
  H === nothing || h!(H, x)
  F === nothing || return f(x)
  nothing
end

import Optim
Optim.optimize(Optim.only_fg!(fg!), [0., 0.], Optim.LBFGS()) # works fine
Optim.optimize(Optim.only_fgh!(fgh!), [0., 0.], Optim.Newton()) 
# ERROR: MethodError: objects of type NLSolversBase.InplaceObjective{Nothing,Nothing,typeof(fgh!)} are not callable

Here is the stack trace

Stacktrace:
[1] finite_difference_gradient!(::Array{Float64,1}, ::NLSolversBase.InplaceObjective{Nothing,Nothing,typeof(fgh!)}, ::Array{Float64,1}, ::DiffEqDiffTools.GradientCache{Nothing,Nothing,Nothing,Val{:central},Float64,Val{true}}) at /opt/julia-depot/packages/DiffEqDiffTools/jv7Il/src/gradients.jl:282
[2] (::getfield(NLSolversBase, Symbol("#g!#42")){NLSolversBase.InplaceObjective{Nothing,Nothing,typeof(fgh!)},DiffEqDiffTools.GradientCache{Nothing,Nothing,Nothing,Val{:central},Float64,Val{true}}})(::Array{Float64,1}, ::Array{Float64,1}) at /opt/julia-depot/packages/NLSolversBase/Cvvki/src/objective_types/twicedifferentiable.jl:103
[3] (::getfield(NLSolversBase, Symbol("#fg!#43")){NLSolversBase.InplaceObjective{Nothing,Nothing,typeof(fgh!)}})(::Array{Float64,1}, ::Array{Float64,1}) at /opt/julia-depot/packages/NLSolversBase/Cvvki/src/objective_types/twicedifferentiable.jl:107
[4] value_gradient!!(::NLSolversBase.TwiceDifferentiable{Float64,Array{Float64,1},Array{Float64,2},Array{Float64,1}}, ::Array{Float64,1}) at /opt/julia-depot/packages/NLSolversBase/Cvvki/src/interface.jl:88
[5] initial_state(::Optim.Newton{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}}}, ::Optim.Options{Float64,Nothing}, ::NLSolversBase.TwiceDifferentiable{Float64,Array{Float64,1},Array{Float64,2},Array{Float64,1}}, ::Array{Float64,1}) at /opt/julia-depot/packages/Optim/fabGe/src/multivariate/solvers/second_order/newton.jl:45
[6] #optimize#87 at /opt/julia-depot/packages/Optim/fabGe/src/multivariate/optimize/optimize.jl:33 [inlined]
[7] optimize(::NLSolversBase.InplaceObjective{Nothing,Nothing,typeof(fgh!)}, ::Array{Float64,1}, ::Optim.Newton{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}}}, ::Optim.Options{Float64,Nothing}) at /opt/julia-depot/packages/Optim/fabGe/src/multivariate/optimize/interface.jl:113 (repeats 2 times)
[8] top-level scope at none:0

A similar error is produced even if the Hessian is not required, as in Optim.optimize(Optim.only_fgh!(fgh!), [0., 0.], Optim.LBFGS()). So the problem seems to be inside only_fgh!?

On Julia v0.6.4 with Optim v0.15.3 I get similar errors with only_fgh! (only_fg! works fine).


#16

Huh, thanks. This is a bug in Optim.


#17

I raised an issue


#18

Thank you! Will fix it one of these days. What surprises me is that I’m actually using this functionality in a private repo, but maybe I’m using an optimize signature that works.