Optim.jl - computing value, gradient and Hessian simultaneously


#1

I’ve seen in the Optim.jl documentation that there is a basic trick to avoid recomputing the same quantity when evaluating a function and its gradient (and potentially also its Hessian): store whatever is reused in a “buffer array” and only update that buffer when needed. However, I believe there are cases where computing value and gradient together is faster than computing them separately, yet storing a buffer is not the ideal solution. For example, imagine having to go over a long for loop where each iteration is more or less expensive depending on whether you also need the gradient and Hessian: if you have to compute value, gradient and Hessian together, you don’t want to go over the loop three times (see the sketch after the list below).
I was wondering, is there a simple way to tell Optim:

  • if you only need the value, do f
  • if you need value and gradient, do g
  • if you need value, gradient and Hessian, do h
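
To illustrate with a toy one-parameter least-squares model (hypothetical, just to show the structure): value and gradient share the expensive exp call in each iteration, so one fused pass over the data beats separate passes.

function value_and_gradient!(x, grad, ts, ys)
    val = 0.0
    grad[1] = 0.0
    for i in eachindex(ts)
        e = exp(x[1] * ts[i])       # expensive quantity shared by value and gradient
        r = e - ys[i]               # residual
        val += r^2                  # value contribution
        grad[1] += 2r * ts[i] * e   # gradient contribution, reusing e
    end
    return val
end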

#2

It is possible to create a DifferentiableFunction like:

df = Optim.DifferentiableFunction(f, g!, fg!) 

where fg!(x, g) returns the function value and stores the gradient in g. This can be used in optimization routines.
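
A minimal sketch of that pattern (assuming the older Optim API quoted above, where the storage argument comes second; recent releases spell things differently), using Rosenbrock as a stand-in for an objective whose value and gradient share an intermediate:

using Optim

f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

function g!(x, storage)
    storage[1] = -2.0 * (1.0 - x[1]) - 400.0 * x[1] * (x[2] - x[1]^2)
    storage[2] = 200.0 * (x[2] - x[1]^2)
end

function fg!(x, storage)
    d = x[2] - x[1]^2          # shared intermediate, computed once
    storage[1] = -2.0 * (1.0 - x[1]) - 400.0 * x[1] * d
    storage[2] = 200.0 * d
    return (1.0 - x[1])^2 + 100.0 * d^2
end

df = Optim.DifferentiableFunction(f, g!, fg!)
res = optimize(df, [0.0, 0.0], method = :l_bfgs)  # symbol-based method selection from that era

When the algorithm needs both value and gradient at the same point, it calls fg! and the shared work is done once.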


#3

Thanks a lot! At least in my use case this seems better than the trick proposed in the documentation.
To make sure I understand: for methods that use second-order derivatives, one can do the same with:

df = Optim.TwiceDifferentiableFunction(f, g!, fg!, h!)

And then:

  • if the algorithm only needs f, it will call f
  • if it only needs the gradient, it will call g!
  • if it needs f and the gradient, it will call fg!
  • if it needs f, gradient and Hessian, it will call both fg! and h! (the idea being, I guess, that the Hessian is needed comparatively rarely and is probably much more costly than anything else, so there’s not much point in trying to optimize here)
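
If I’ve got it right, extending the sketch from #2 with an in-place Hessian would look something like this (again assuming the older API, and the same Rosenbrock example):

function h!(x, storage)
    storage[1, 1] = 2.0 - 400.0 * (x[2] - x[1]^2) + 800.0 * x[1]^2
    storage[1, 2] = -400.0 * x[1]
    storage[2, 1] = -400.0 * x[1]
    storage[2, 2] = 200.0
end

df2 = Optim.TwiceDifferentiableFunction(f, g!, fg!, h!)
res = optimize(df2, [0.0, 0.0], method = :newton)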

#4

You can look at how it is used in the code: https://github.com/JuliaOpt/Optim.jl/blob/addda9ae36044d691b83a12c421b57675b234eeb/src/newton.jl#L30


#5

Didn’t notice this question in December, but check out this part of the NLSolversBase README (that’s where the NDifferentiable types live)
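
For anyone landing here later: the interface described there lets you pass one function that computes only the requested pieces. A minimal sketch, assuming the only_fg! wrapper from recent Optim/NLSolversBase releases (F and G are nothing when the value or gradient, respectively, isn’t needed):

using Optim

function fg!(F, G, x)
    d = x[2] - x[1]^2              # shared intermediate
    if G !== nothing               # gradient requested: fill G in place
        G[1] = -2.0 * (1.0 - x[1]) - 400.0 * x[1] * d
        G[2] = 200.0 * d
    end
    if F !== nothing               # value requested: return it
        return (1.0 - x[1])^2 + 100.0 * d^2
    end
    return nothing
end

res = optimize(Optim.only_fg!(fg!), [0.0, 0.0], LBFGS())

There is an analogous only_fgh! for methods that also need the Hessian.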


#6

Thanks a lot! Will go through it.


#7

I haven’t advertised it too broadly, so please report any bugs you may find!