# How to use specific convergence measures in Optim.jl?

I’m using this code:

```julia
results = Optim.optimize(
    obj, Optim.TwiceDifferentiableConstraints(lo, hi), θ0,
    Optim.IPNewton()
)
```

And getting this output:

```text
 * Status: success

 * Candidate solution
    Final objective value:     -3.375589e+01

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 0.00e+00 ≤ 0.0e+00
    |x - x'|/|x'|          = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|         = 1.02e+00 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 3.01e-02 ≰ 0.0e+00
    |g(x)|                 = 7.72e+04 ≰ 1.0e-08

 * Work counters
    Seconds run:   7  (vs limit Inf)
    Iterations:    5
    f(x) calls:    94
    ∇f(x) calls:   94
```

As you can see, only the “algorithm looks stuck in the `x` domain” measures are satisfied, while `|f(x) - f(x')|` and `|f(x) - f(x')|/|f(x')|` are far from their tolerances. (I think a large gradient norm (`|g(x)| = 7.72e+04`) is OK in constrained problems?)

Is it possible to use only a subset of these convergence measures? Say, I want to make sure that `|f(x) - f(x')|/|f(x')| ≤ 0.0e+00` is always satisfied. How do I do that?

It looks like you want to set `x_abstol` and `x_reltol` to a larger non-zero value and `f_reltol` much smaller.

Also verify that `g_converged` is `true`: https://julianlsolvers.github.io/Optim.jl/stable/#user/minimization/#notes-on-convergence-flags-and-checks
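For instance, a quick check on the `results` object from your snippet, using the accessor functions Optim provides:

```julia
# Inspect which convergence checks were satisfied:
Optim.converged(results)    # overall convergence flag
Optim.x_converged(results)  # |x - x'| checks
Optim.f_converged(results)  # |f(x) - f(x')| checks
Optim.g_converged(results)  # |g(x)| check
```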

In general, you have these options available (cf. the `Optim.Options` docstring), but note the comment that `values 0 and NaN indicate unlimited`:

```text
help?> Optim.Options
  Configurable options with defaults (values 0 and NaN indicate unlimited):

  x_abstol::Real = 0.0,
  x_reltol::Real = 0.0,
  f_abstol::Real = 0.0,
  f_reltol::Real = 0.0,
  g_abstol::Real = 1e-8,
  g_reltol::Real = 1e-8,
  ...
```
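For example, a minimal sketch reusing `obj`, `lo`, `hi`, and `θ0` from your snippet (the tolerance values here are only illustrative):

```julia
opts = Optim.Options(
    x_abstol = 1e-6,   # loosen the x-domain checks
    x_reltol = 1e-6,
    f_reltol = 1e-12,  # tighten the relative objective check
)
results = Optim.optimize(
    obj, Optim.TwiceDifferentiableConstraints(lo, hi), θ0,
    Optim.IPNewton(), opts,
)
```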

Lastly, it is hard to help without a working example. Inferring from the code, you have a single objective function without explicit gradient information and box constraints, so the better choice might be to simply use

```julia
inner_optimizer = Optim.LBFGS()
results = Optim.optimize(
    obj, lo, hi, θ0, Optim.Fminbox(inner_optimizer)
)
```

Hmm, so I set `x_abstol` and `x_reltol` to a larger non-zero value and `f_reltol` much smaller. Now Optim reports success even when none of the convergence measures reported are satisfied.

(BTW, I have a different objective now, and inequality constraints, so box minimization won’t cut it, I think)

## Code

```julia
res = let
    objective = θ -> -log_likelihood(θ, df.price, df.info)

    # Bounds
    lo = [[0, -Inf, -Inf, 1e-10, 1e-10]; zeros(6)]
    hi = [[1, Inf, Inf, Inf, Inf]; fill(Inf, 6)]

    function stationarity!(c, θ)
        p, mu0, mu1, c0, c1, s0, s1, v0, v1, n0, n1 = θ

        # Write into the constraint vector in place
        # (a plain `c = ...` would only rebind the local name)
        c[1] = 1 - s0 - v0
        c[2] = 1 - s1 - v1
        c
    end

    function stationarity_jac!(jac, θ)
        # Diff constr 1
        jac[1, 1:5] .= 0     # w.r.t. p, mu0, mu1, c0, c1
        jac[1, 6] = -1       # w.r.t. s0
        jac[1, 7] = 0        # w.r.t. s1
        jac[1, 8] = -1       # w.r.t. v0
        jac[1, 9] = 0        # w.r.t. v1
        jac[1, 10:end] .= 0  # w.r.t. n0, n1

        # Diff constr 2
        jac[2, 1:5] .= 0     # w.r.t. p, mu0, mu1, c0, c1
        jac[2, 6] = 0        # w.r.t. s0
        jac[2, 7] = -1       # w.r.t. s1
        jac[2, 8] = 0        # w.r.t. v0
        jac[2, 9] = -1       # w.r.t. v1
        jac[2, 10:end] .= 0  # w.r.t. n0, n1

        jac
    end

    function stationarity_hess_prod!(h, x, λ)
        h .= 0
        h
    end

    # These are equality constraints
    constr_lo = constr_hi = [0., 0.]

    constr = Optim.TwiceDifferentiableConstraints(
        stationarity!, stationarity_jac!, stationarity_hess_prod!,
        lo, hi, constr_lo, constr_hi
    )

    Optim.optimize(
        objective, constr, x0, Optim.IPNewton(),
        Optim.Options(
            iterations=5000,
            x_tol=1e-8, f_tol=1e-6,
            f_reltol=1e-8, f_abstol=1e-6,
            x_reltol=1e-8, x_abstol=1e-6,
            show_trace=true, show_every=20
        );
        autodiff=:forward
    )
end
```

## Output

```text
Iter     Lagrangian value Function value   Gradient norm    |==constr.|      μ
     0   1.755735e+04     -1.150427e+02    9.028653e+03     1.767238e+04     1.09e-04
 * time: 1.635028405396512e9
    20   1.754711e+04     -1.152350e+02    9.033546e+03     1.766235e+04     6.54e-05
 * time: 1.635028405473314e9
    40   1.754707e+04     -1.152537e+02    9.033518e+03     1.766232e+04     6.57e-05
 * time: 1.635028405484916e9
    60   1.754703e+04     -1.152714e+02    9.033493e+03     1.766230e+04     6.59e-05
 * time: 1.635028405497786e9
    80   1.754702e+04     -1.152913e+02    9.036839e+03     1.766228e+04     6.01e-04
 * time: 1.635028405512073e9
 * Status: success

 * Candidate solution
    Final objective value:     -1.153089e+02

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 6.01e-05 ≰ 1.0e-08
    |x - x'|/|x'|          = 6.95e-05 ≰ 1.0e-08
    |f(x) - f(x')|         = 1.48e-03 ≰ 1.0e-06
    |f(x) - f(x')|/|f(x')| = 1.29e-05 ≰ 1.0e-06
    |g(x)|                 = 2.71e+03 ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    91
    f(x) calls:    618
    ∇f(x) calls:   618
```

Also, I didn’t specify any gradient-based criteria (neither `g_abstol` nor `g_reltol`), yet the upper bound for `|g(x)|` somehow became `1e-8`?

## ???

`Optim.x_converged(res), Optim.f_converged(res), Optim.g_converged(res)` returns `(false, true, false)`, so `Optim.f_converged(res) == true`, even though none of the reported convergence measures are even remotely close to the desired values.

EDIT: I changed the initial point and got a situation where one of the reported convergence measures is satisfied, but the algorithm reports failure anyway:

```text
 * Status: failure (reached maximum number of iterations)

 * Candidate solution
    Final objective value:     -1.289666e+02

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 3.12e-05 ≰ 1.0e-08
    |x - x'|/|x'|          = 3.43e-05 ≰ 1.0e-08
    |f(x) - f(x')|         = 1.04e-04 ≰ 1.0e-06
    |f(x) - f(x')|/|f(x')| = 8.07e-07 ≤ 1.0e-06
    |g(x)|                 = 1.42e+03 ≰ 1.0e-08

 * Work counters
    Seconds run:   3  (vs limit Inf)
    Iterations:    5000
    f(x) calls:    63273
    ∇f(x) calls:   63273
```

What’s going on here? What’s the point of these convergence measures if the algorithm doesn’t seem to take them into consideration?

Sorry, I don’t think I can post an MCVE: I’m doing maximum-likelihood estimation, so creating one would mean posting all the data, and it’s kinda confidential…

The extra information and testing are useful but not conclusive.
(Keep in mind that I am not well-versed in the full Optim.jl design, but…)

Note that `x_tol` and `x_abstol` are apparently equivalent settings; it is preferable to set only one of them, such as `x_abstol`, since `x_tol` will overwrite it (as seen in your example). Similarly, `f_tol` and `f_reltol` (note the `rel`) are equivalent, with the latter best set directly; likewise `g_tol` and `g_abstol`.
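As a sketch, prefer the explicit names and set each tolerance exactly once (values here are illustrative):

```julia
# Avoid mixing aliases: `x_tol` would silently overwrite `x_abstol`.
Optim.Options(
    x_abstol = 1e-6,   # rather than x_tol
    f_reltol = 1e-8,   # rather than f_tol (which is *relative*)
    g_abstol = 1e-8,   # rather than g_tol
)
```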

It seems that the `IPNewton()` method just ignores absolute function differences, i.e. `f_abstol` is irrelevant, and similarly `x_reltol` is silently ignored in favour of `x_abstol`.

Also, the convergence criterion for this method is satisfaction of `x_abstol`, or of `g_abstol` (aka `g_tol`), or of `f_reltol` via successive function-value convergence.

It may be that you are getting “convergence” from successively close function values, as in the last lines of your trace.

The note specific to `IPNewton()` says:

> Note: For constrained optimization problems, we recommend always enabling `allow_f_increases` and `successive_f_tol` in the options passed to `optimize`. The default is set to `Optim.Options(allow_f_increases = true, successive_f_tol = 2)`.

Hence you can try setting those in your `Options`, but also try raising `successive_f_tol` (to 5 or 10), and set `show_every=1` to really get a sense of what is changing at the end. For example:
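```julia
# Illustrative values only: the IPNewton-recommended settings, plus a
# stricter successive-f check and a per-iteration trace.
Optim.Options(
    allow_f_increases = true,
    successive_f_tol = 10,  # require more nearly-identical f values in a row
    f_reltol = 1e-8,
    show_trace = true,
    show_every = 1,
)
```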

Lastly, since you want to do constrained optimization, I can highly recommend considering JuMP with the Ipopt solver, following this MLE example:
https://jump.dev/JuMP.jl/stable/tutorials/Nonlinear%20programs/mle/
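A minimal sketch in the spirit of that tutorial: Gaussian MLE with σ constrained positive. The data here are made up for illustration, and your likelihood and stationarity constraints would take the objective's place.

```julia
using JuMP, Ipopt

data = 2.0 .* randn(1000) .+ 1.0   # made-up sample, true μ = 1, σ = 2
n = length(data)

model = Model(Ipopt.Optimizer)
@variable(model, μ, start = 0.0)
@variable(model, σ >= 1e-8, start = 1.0)  # keep σ strictly positive
# Gaussian log-likelihood, maximized directly
@NLobjective(model, Max,
    n / 2 * log(1 / (2 * π * σ^2)) -
    sum((data[i] - μ)^2 for i in 1:n) / (2 * σ^2))
optimize!(model)
value(μ), value(σ)
```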
