How to use ForwardDiff with NLopt?

I’m working on an optimization problem, and I’m trying to make use of the gradient-based algorithms in the NLopt library (specifically LD_SLSQP). The objective function which I am trying to minimize does not have an analytic form (evaluating it involves computing the numerical solution of a system of ODEs), so the gradient must be computed numerically.

I’d like to compute the gradient using automatic differentiation with the ForwardDiff package, but I’m not sure exactly how to do this. My question is: what is the correct way to supply a gradient to an NLopt algorithm which is computed via ForwardDiff?

The tutorial at JuliaOpt/NLopt.jl says the gradient must be modified in place, e.g.

using NLopt

function myfunc(x::Vector, grad::Vector)
    if length(grad) > 0
        grad[1] = 0
        grad[2] = 0.5/sqrt(x[2])
    end
    return sqrt(x[2])
end

But ForwardDiff needs to evaluate myfunc in order to calculate grad, so I don’t see how this would work with in-place modification. Also, the ForwardDiff documentation (ForwardDiff.jl/stable/user/limitations/) says that the target function must be unary (only one argument), while NLopt requires that myfunc() have two: x and grad.

Is there a way to make this work? Or do I need to use a different optimization algorithm/different AD algorithm?

I think the JuMP interface does not require you to explicitly calculate derivatives. You set up the problem, select NLopt as the backend and JuMP does the job / derivatives for you.
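As a rough illustration of the JuMP route, here is a minimal sketch. It assumes a recent JuMP release (for set_attribute) and uses a toy algebraic objective that is not from the original problem. An objective that calls an ODE solver cannot be written algebraically like this and would have to be registered with JuMP as a user-defined function, which this sketch does not cover.

```julia
using JuMP
import NLopt

# Select NLopt as the backend and pick the SLSQP algorithm.
model = Model(NLopt.Optimizer)
set_attribute(model, "algorithm", :LD_SLSQP)

# Two decision variables with a starting point (hypothetical values).
@variable(model, x[1:2], start = 0.5)

# Toy objective (an assumption for illustration); JuMP computes its derivatives.
@NLobjective(model, Min, (x[1] - 1)^2 + (x[2] - 2)^2)

optimize!(model)
xsol = value.(x)
```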

You can wrap your function in one which NLopt will like, something like this:

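function nlfunc(vec, grad)
    if length(grad) != 0
        # NLopt passes a preallocated grad; fill it in place
        ForwardDiff.gradient!(grad, myfun, vec)
    end
    return myfun(vec)  # NLopt expects the objective value back
end
nlname = NLopt.Opt(opt, length(start_vec))
min_objective!(nlname, nlfunc)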

You could use the DiffResults package so that the objective value and the gradient come out of a single ForwardDiff pass, rather than evaluating the function again after the gradient has already been computed.
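A minimal sketch of that idea, assuming the DiffResults package is installed. The myfun below is a hypothetical stand-in for the real objective, which would solve the ODEs.

```julia
using ForwardDiff, DiffResults

# Hypothetical stand-in for the real (ODE-based) objective.
myfun(x) = sqrt(x[2]) + (x[1] - 1)^2

function nlfunc(vec, grad)
    if length(grad) == 0
        return myfun(vec)  # no gradient requested: value only
    end
    # Container that holds both the value and the gradient.
    result = DiffResults.GradientResult(vec)
    # One ForwardDiff pass fills in both fields.
    result = ForwardDiff.gradient!(result, myfun, vec)
    # Copy the gradient into the array NLopt handed us.
    grad .= DiffResults.gradient(result)
    return DiffResults.value(result)
end
```

Whether this saves noticeable time depends on how expensive the objective is compared with the gradient pass.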


Are you sure about this?

DifferentialEquations.jl has long been AD friendly, and for many ODEs even implements adjoint sensitivity methods that should be faster than AD. At this point, the capabilities cover a huge array of problems, so I’d be surprised if your problem is so hairy you need to supply a gradient yourself.

Also see DiffEqFlux.jl, which lets you throw ODEs into optimization, even if they’re chained with neural nets and stuff. Often you just set how you want your sensitivities with a sensealg parameter, e.g. AD backprop or adjoints computed within the ODE solver. For your particular problem, maybe you only need to specify ForwardDiffSensitivity and you’re done. Or if you really want to supply everything manually, you might only need the vector-Jacobian product, which is also automatic by default but can be supplied manually.


Thanks for the suggestion, though unfortunately this didn’t work: the algorithm stopped after the first iteration.

Actually I’m confused as to how this could work; since the grad argument is not initialized, it would have a length of 0 to begin with, and so ForwardDiff.gradient! would never get called. Or am I misunderstanding?

It is entirely possible I messed it up, but the rough idea is like that. NLopt will create an array and pass it to nlfunc, which must write the gradient into it. It may choose not to pass this at every step. I think it also creates vec, rather than re-using the start_vec you pass to NLopt.optimize(nlname, start_vec). Here myfun is the function you are actually minimising. Perhaps grad .= ForwardDiff.gradient(myfun, vec) is simpler.
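Putting the pieces together, a self-contained sketch might look like the following. The quadratic myfun, the starting point, and the tolerance are assumptions chosen for illustration; the real objective would take a single vector argument in the same way.

```julia
using NLopt, ForwardDiff

# Hypothetical single-argument objective, as ForwardDiff requires.
myfun(x) = (x[1] - 1)^2 + (x[2] - 2)^2

# Two-argument wrapper in the form NLopt expects.
function nlfunc(vec::Vector, grad::Vector)
    if length(grad) > 0
        # NLopt allocated grad; overwrite its contents in place.
        grad .= ForwardDiff.gradient(myfun, vec)
    end
    return myfun(vec)
end

start_vec = [0.0, 0.0]
nlname = NLopt.Opt(:LD_SLSQP, length(start_vec))
NLopt.min_objective!(nlname, nlfunc)
NLopt.xtol_rel!(nlname, 1e-8)

# Returns the minimum value, the minimizer, and a status symbol.
minf, minx, ret = NLopt.optimize(nlname, start_vec)
```

Note that myfun itself never sees grad; only the wrapper does, which is how the unary requirement of ForwardDiff and the two-argument requirement of NLopt are both satisfied.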

Optim handles more of this for you, but not sure it has the same algorithm:

using Optim

od = OnceDifferentiable(myfun, start_vec, autodiff=:forward)
Optim.optimize(od, start_vec, LBFGS())

Thanks for your help! It turns out I made a mistake: I forgot to remove the grad argument from my_func. After fixing this, I tried it again with SLSQP and it works! There is one minor issue though: I’m printing the objective function value every 20 or so iterations to display the progress, but it prints all this junk each time:

Dual{ForwardDiff.Tag{var"#objective#101"{typeof(One_Age_Model_1eta),typeof(f_ICs),typeof(norm1),DataFrame,Array{String,1},Array{String,1},Array{Float64,1},Array{Union{Missing, Float64},1},Array{Union{Missing, Float64},1},Array{Float64,1},Array{Float64,1},Array{Int64,1},Array{Int64,1},Array{Int64,1},Array{Int64,1},Int64,Int64,Int64},Float64}}(1.196920482884103e-5,-1.4501145807288991e-5,-2.06553122073125e-5,-6.317391855162725e-7,1.4807120422188783e-7,5.411123230817588e-6,5.823810564322322e-6,3.606300957515629e-7,-3.351628923995338e-6,-2.757480190291689e-6,2.4292065517195563e-6)

Is there a way to suppress this?

Edit: I read the docs and learned of the show_trace option, so I resolved the above issue!
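For anyone printing from inside the objective itself: the long output appears because, while ForwardDiff is computing the gradient, the function is evaluated on Dual numbers rather than plain floats. A possible workaround, sketched here with a hypothetical objective and a hypothetical wrapper name, is to strip the derivative part with ForwardDiff.value before printing.

```julia
using ForwardDiff

# Hypothetical objective for illustration.
myfun(x) = sum(abs2, x)

function logged_objective(x)
    val = myfun(x)
    # ForwardDiff.value returns the plain number whether val is a Dual or a Float64.
    println("objective = ", ForwardDiff.value(val))
    return val  # return the untouched value so differentiation still works
end

g = ForwardDiff.gradient(logged_objective, [1.0, 2.0])
```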

Hi, thanks a lot for your reply. I am brand new to AD so I never knew about these tools, and they look extremely cool! I will look into them more, as I’m sure they would be useful for my code.

Hi Michael! I have the same problem as you: I am using autodiff=:forward and SLSQP to solve a complex minimization problem. In my situation it also gets stuck after the first iteration, neither giving a result nor moving on from there. You refer to removing the grad argument from my_func; what do you mean by this? Also, did you have to have a constraint for this to work? Thank you so much for your help, I am really struggling with this.

Hi @annabelle_Farnworth, can you please make a new topic with a reproducible example of your problem? We try to avoid posting on older threads after they’ve been solved.

p.s. If you haven’t already, take a read of “Please read: make it easier to help you”.