Hmm, so I set x_abstol and x_reltol to larger non-zero values and f_reltol much smaller. Now Optim reports success even though none of the reported convergence measures are satisfied.
(BTW, I have a different objective now, and inequality constraints, so box minimization won’t cut it, I think.)
Code
res = let
    objective = θ -> -log_likelihood(θ, df.price, df.info)

    # Box bounds on the 11 parameters
    lo = [[0, -Inf, -Inf, 1e-10, 1e-10]; zeros(6)]
    hi = [[1, Inf, Inf, Inf, Inf]; fill(Inf, 6)]

    function stationarity!(c, θ)
        p, mu0, mu1, c0, c1, s0, s1, v0, v1, n0, n1 = θ
        c[1] = 1 - s0 - v0
        c[2] = 1 - s1 - v1
        c
    end

    function stationarity_jac!(jac, θ)
        # Derivatives of constraint 1
        jac[1, 1:5] .= 0     # w.r.t. p, mu0, mu1, c0, c1
        jac[1, 6] = -1       # w.r.t. s0
        jac[1, 7] = 0        # w.r.t. s1
        jac[1, 8] = -1       # w.r.t. v0
        jac[1, 9] = 0        # w.r.t. v1
        jac[1, 10:end] .= 0  # w.r.t. n0, n1
        # Derivatives of constraint 2
        jac[2, 1:5] .= 0     # w.r.t. p, mu0, mu1, c0, c1
        jac[2, 6] = 0        # w.r.t. s0
        jac[2, 7] = -1       # w.r.t. s1
        jac[2, 8] = 0        # w.r.t. v0
        jac[2, 9] = -1       # w.r.t. v1
        jac[2, 10:end] .= 0  # w.r.t. n0, n1
        jac
    end

    # The constraints are linear, so their Hessian contribution is zero
    function stationarity_hess_prod!(h, x, λ)
        h .= 0
        h
    end

    # These are equality constraints (lower and upper bounds coincide)
    constr_lo = constr_hi = [0., 0.]
    constr = Optim.TwiceDifferentiableConstraints(
        stationarity!, stationarity_jac!, stationarity_hess_prod!,
        lo, hi, constr_lo, constr_hi
    )

    # x0: initial parameter vector (defined elsewhere, not shown here)
    Optim.optimize(
        objective, constr, x0, Optim.IPNewton(),
        Optim.Options(
            iterations=5000,
            x_tol=1e-8, f_tol=1e-6,
            f_reltol=1e-8, f_abstol=1e-6,
            x_reltol=1e-8, x_abstol=1e-6,
            show_trace=true, show_every=20
        );
        autodiff=:forward
    )
end
Output
Iter Lagrangian value Function value Gradient norm |==constr.| μ
0 1.755735e+04 -1.150427e+02 9.028653e+03 1.767238e+04 1.09e-04
* time: 1.635028405396512e9
20 1.754711e+04 -1.152350e+02 9.033546e+03 1.766235e+04 6.54e-05
* time: 1.635028405473314e9
40 1.754707e+04 -1.152537e+02 9.033518e+03 1.766232e+04 6.57e-05
* time: 1.635028405484916e9
60 1.754703e+04 -1.152714e+02 9.033493e+03 1.766230e+04 6.59e-05
* time: 1.635028405497786e9
80 1.754702e+04 -1.152913e+02 9.036839e+03 1.766228e+04 6.01e-04
* time: 1.635028405512073e9
* Status: success
* Candidate solution
Final objective value: -1.153089e+02
* Found with
Algorithm: Interior Point Newton
* Convergence measures
|x - x'| = 6.01e-05 ≰ 1.0e-08
|x - x'|/|x'| = 6.95e-05 ≰ 1.0e-08
|f(x) - f(x')| = 1.48e-03 ≰ 1.0e-06
|f(x) - f(x')|/|f(x')| = 1.29e-05 ≰ 1.0e-06
|g(x)| = 2.71e+03 ≰ 1.0e-08
* Work counters
Seconds run: 0 (vs limit Inf)
Iterations: 91
f(x) calls: 618
∇f(x) calls: 618
Also, I didn’t specify any criterion involving the gradient (neither the g_abstol nor the g_reltol option), but the upper bound for |g(x)| somehow became 1e-8?
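Poking at the defaults (I’m assuming here that Optim.Options exposes these tolerances as plain fields), it looks like 1e-8 is simply the default gradient tolerance:

opts = Optim.Options()
opts.g_abstol   # 1.0e-8 by default (if I read the docs right), matching the |g(x)| bound above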
Optim.x_converged(res), Optim.f_converged(res), Optim.g_converged(res) returns (false, true, false), so Optim.f_converged(res) == true, even though none of the reported convergence measures are even remotely close to the desired values.
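If I understand it right (just my assumption, I haven’t dug through the source), the overall status is simply an OR of these three flags, which would explain the success label, though not why f_converged is true in the first place:

Optim.converged(res)   # presumably x_converged(res) || f_converged(res) || g_converged(res)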
EDIT: I changed the initial point and got a situation where one of the reported convergence measures is satisfied, but the algorithm reports failure anyway:
* Status: failure (reached maximum number of iterations)
* Candidate solution
Final objective value: -1.289666e+02
* Found with
Algorithm: Interior Point Newton
* Convergence measures
|x - x'| = 3.12e-05 ≰ 1.0e-08
|x - x'|/|x'| = 3.43e-05 ≰ 1.0e-08
|f(x) - f(x')| = 1.04e-04 ≰ 1.0e-06
|f(x) - f(x')|/|f(x')| = 8.07e-07 ≤ 1.0e-06
|g(x)| = 1.42e+03 ≰ 1.0e-08
* Work counters
Seconds run: 3 (vs limit Inf)
Iterations: 5000
f(x) calls: 63273
∇f(x) calls: 63273
What’s going on here? What’s the point of these convergence measures if the algorithm doesn’t seem to take them into consideration?
Sorry, I don’t think I can post an MCVE: I’m doing maximum-likelihood estimation, so an MCVE would mean posting all the data, and it’s kinda confidential…