Solving an optimal control problem with Optim.jl

I am trying to solve an optimal control problem in Julia. I tried using NLOptControl.jl and JuMP.jl but ran into some difficulties. See this post.

Therefore I am trying to use Optim.jl as an optimizer. Since my objective function is quite complicated, I cannot compute its derivatives by hand, so I must use algorithms that are derivative-free, use numerical differentiation, or use forward-mode automatic differentiation.
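As far as I understand, those three options map onto Optim.jl roughly like this (a sketch; f and x0 stand for my objective function and initial point):

using Optim

res1 = optimize(f, x0, NelderMead())                 # derivative-free
res2 = optimize(f, x0, LBFGS())                      # gradient via finite differences (the default)
res3 = optimize(f, x0, LBFGS(); autodiff = :forward) # gradient via ForwardDiff.jl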

Nelder-Mead

I first tried the NelderMead() algorithm and found that the solver takes a long time to reach a result; looking at the trace of the solution, it seems like it is not exploring much of the space.

This is an example of the output of 10 iterations:

Iter     Function value    √(Σ(yᵢ-ȳ)²)/n 
------   --------------    --------------
     0    -1.017303e+03     1.043373e+02
 * time: 9.298324584960938e-5
 * step_type: initial
 * centroid: [-4.0, -1.2375]
     1    -1.017303e+03     1.565657e+01
 * time: 0.6034481525421143
 * step_type: reflection
 * centroid: [-4.0, -1.2375]
     2    -1.017303e+03     2.761914e+01
 * time: 1.8348710536956787
 * step_type: reflection
 * centroid: [-3.0125, -1.475]
     3    -1.062787e+03     2.003074e+01
 * time: 2.4888319969177246
 * step_type: reflection
 * centroid: [-3.0125, -1.7125000000000001]
     4    -1.062787e+03     2.770746e+01
 * time: 3.7384161949157715
 * step_type: reflection
 * centroid: [-3.0125, -1.9500000000000002]
     5    -1.091746e+03     1.215436e+01
 * time: 5.045389175415039
 * step_type: inside contraction
 * centroid: [-2.0250000000000004, -2.1875]
     6    -1.091746e+03     4.488257e+00
 * time: 6.355751037597656
 * step_type: outside contraction
 * centroid: [-2.5187500000000003, -2.246875]
     7    -1.093542e+03     2.053544e+00
 * time: 7.638545989990234
 * step_type: outside contraction
 * centroid: [-2.3953125, -2.41015625]
     8    -1.096713e+03     1.903858e+00
 * time: 8.946972131729126
 * step_type: inside contraction
 * centroid: [-2.426171875, -2.4880859375]
     9    -1.098090e+03     9.230324e-01
 * time: 10.217251062393188
 * step_type: inside contraction
 * centroid: [-2.15615234375, -2.518701171875]
    10    -1.098955e+03     3.976319e-01
 * time: 11.521685123443604
 * step_type: inside contraction
 * centroid: [-2.3432373046875, -2.45677490234375]

It seems to me that the solver converges very quickly to some small area of the parameter space.
I read in the documentation that Nelder-Mead is indeed sensitive to initial conditions. Is the best way to handle this to try many initial conditions and see where the algorithm converges, as in the sketch below?
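The naive multistart version I have in mind would be something like this (just a sketch; the random perturbation around x0 is my own assumption):

using Optim

starts = [x0 .+ 0.5 .* randn(length(x0)) for _ in 1:20]  # 20 perturbed initial points
results = [optimize(f, s, NelderMead()) for s in starts]
best = results[argmin(Optim.minimum.(results))]          # keep the run with the lowest minimum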

Moreover, the documentation describes the initial simplex as being constructed from two constants, a and b, but I couldn't find how to configure them. Is there a way to do that? If not, how are they chosen?
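From reading the Optim.jl source, I suspect (but have not verified) that a and b belong to Optim.AffineSimplexer, which can be passed through the NelderMead constructor:

using Optim

# a and b control the affine construction of the initial simplex;
# the values below are the defaults I found in the source.
method = NelderMead(initial_simplex = Optim.AffineSimplexer(a = 0.025, b = 0.5))
res = optimize(f, x0, method)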

I ran this algorithm for a long time and it never converged. It seems to keep wandering in a small area of the parameter space without really improving. Is there a way to solve this issue, e.g. by defining the resolution at which the result counts as converged?

I thought the parameters f_tol and x_tol (see the documentation) were meant for that, but I get

julia> x0 = [-1.0, -2.0, -1.5];
julia> res = optimize(f, x0, 
    Optim.Options(
        show_trace = true, 
        iterations = 20,
        x_tol = 1,
        f_tol = 2,
        allow_f_increases = true))

Iter     Function value    √(Σ(yᵢ-ȳ)²)/n 
------   --------------    --------------
     0    -9.966614e+00     2.430512e-01
 * time: 7.915496826171875e-5
     1    -9.966614e+00     2.608484e-01
 * time: 1.9279539585113525
     2    -1.052182e+01     2.385836e-01
 * time: 2.8761539459228516
     3    -1.052182e+01     2.080585e-01
 * time: 4.82260799407959
     4    -1.052182e+01     2.578224e-01
 * time: 6.8006391525268555
     5    -1.073959e+01     2.280237e-01
 * time: 8.717095136642456
     6    -1.080202e+01     1.604481e-01
 * time: 10.646085023880005
     7    -1.096993e+01     1.408768e-01
 * time: 12.670480012893677
     8    -1.109795e+01     1.218332e-01
 * time: 14.670469045639038
     9    -1.109956e+01     6.590642e-02
 * time: 16.696977138519287
    10    -1.114754e+01     2.033071e-02
 * time: 17.736495971679688
    11    -1.114754e+01     3.135519e-02
 * time: 19.790922164916992
    12    -1.117611e+01     2.515238e-02
 * time: 20.841745138168335
    13    -1.117611e+01     2.587203e-02
 * time: 22.871349096298218
    14    -1.120404e+01     2.175567e-02
 * time: 24.820019006729126
    15    -1.120404e+01     1.100862e-02
 * time: 26.80953598022461
    16    -1.120404e+01     5.808911e-03
 * time: 28.815988063812256
    17    -1.121202e+01     3.990103e-03
 * time: 29.812538146972656
    18    -1.121202e+01     6.873401e-03
 * time: 31.818549156188965
    19    -1.122275e+01     5.271104e-03
 * time: 33.83030414581299
    20    -1.122275e+01     3.908043e-03
 * time: 35.88078594207764

 * Status: failure (reached maximum number of iterations) (line search failed)

 * Candidate solution
    Minimizer: [-3.08e+00, -1.72e+00, -3.67e+00]
    Minimum:   -1.122472e+01

 * Found with
    Algorithm:     Nelder-Mead
    Initial Point: [-1.00e+00, -2.00e+00, -1.50e+00]

 * Convergence measures
    √(Σ(yᵢ-ȳ)²)/n ≰ 1.0e-08

 * Work counters
    Seconds run:   36  (vs limit Inf)
    Iterations:    20
    f(x) calls:    42

So it seems that the value of f over the last iterations changes by less than f_tol = 2, and yet the solver does not report convergence.
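Judging from the convergence measure printed above (√(Σ(yᵢ-ȳ)²)/n ≰ 1.0e-08), I wonder whether Nelder-Mead actually checks the spread of the function values over the simplex against g_tol rather than f_tol, so perhaps the right knob is something like this (untested guess):

julia> res = optimize(f, x0,
    NelderMead(),
    Optim.Options(
        show_trace = true,
        iterations = 20,
        g_tol = 1e-3))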

(L-)BFGS

I am also trying to use LBFGS. I have implemented my code in a way that (hopefully) supports forward-mode automatic differentiation, and I am trying to run it both with and without autodiff = :forward. The algorithm has been running for a long time (more than 10 hours), and I think it is still differentiating, since I set an iteration limit of 5.
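For reference, this is roughly the call I am using (with f and x0 as above; the exact options are from memory):

julia> res = optimize(f, x0, LBFGS(),
    Optim.Options(show_trace = true, iterations = 5);
    autodiff = :forward)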

Is there a way to fix this issue in order to use this algorithm?

Other algorithms?

I am new to solving optimization problems, and I am wondering: are there other approaches I should take here?

Enlarging my parameter space

For now I am solving the problem with a small number of parameters (i.e., my x0 array is O(1)), but I am willing to enlarge it to O(100) in the end.

P.S.

I have posted this question about the same optimal control problem using JuMP.jl, but since my questions here are specific to Optim.jl, I thought I should open a separate thread.

Optimal control problems tend to be very high dimensional, but also very structured. Optim does not really exploit this structure out of the box, so it’s very likely going to be suboptimal. Have you had a look at SparseDiffTools.jl?

Thanks for the pointer. From what I understand, SparseDiffTools.jl might provide a fast way to compute the Jacobian, which I could then feed into the algorithms in Optim.jl?

I read the docs, but I do not yet understand how to use this with, e.g., Optim.jl.
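In case it helps frame the question: as far as I can tell, Optim.jl accepts a user-supplied in-place gradient, so presumably any external differentiation tool could be plugged in through something like this (generic sketch, not specific to SparseDiffTools.jl; my_gradient! is a hypothetical placeholder):

using Optim

function g!(G, x)
    my_gradient!(G, x)  # hypothetical: fill G with ∇f(x) computed by whatever external tool
    return G
end

res = optimize(f, g!, x0, LBFGS())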

Have you tried any of the reverse-mode differentiation packages? In a typical optimal control problem you have one cost value but a ton of control inputs, which is a situation where reverse mode usually works better. I've had a lot of success with ReverseDiff.jl previously.
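Roughly like this, from memory, so double-check the tape API against the ReverseDiff.jl docs:

using Optim, ReverseDiff

# Record and compile a gradient tape once (valid as long as f has no
# input-dependent branches), then reuse it as the in-place gradient for Optim.
tape = ReverseDiff.compile(ReverseDiff.GradientTape(f, x0))
g!(G, x) = ReverseDiff.gradient!(G, tape, x)

res = optimize(f, g!, x0, LBFGS())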

If that’s the issue you’re thinking about, then local optimization methods are not your solution. Use the particle swarm method of Optim or BlackBoxOptim.jl if you want good global results.
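E.g., something along these lines (the box bounds lb and ub are placeholders you would set for your problem):

using Optim

method = ParticleSwarm(lower = lb, upper = ub, n_particles = 20)
res = optimize(f, x0, method, Optim.Options(iterations = 1000))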
