# Is there a Julia equivalent of scipy.optimize.minimize(method='TNC')?

which is a truncated Newton (TNC) algorithm, see here for details: https://docs.scipy.org/doc/scipy/reference/optimize.minimize-tnc.html#optimize-minimize-tnc.

I have a box constrained optimization problem which requires more evaluations with Optim.jl + BFGS (or any other algorithms) than this Python TNC algorithm. Does anyone know if there is a Julia implementation of this algorithm or a binding (similar to the Python one which is just a binding of a C implementation)?

What does truncated Newton do? You can modify the line search algorithm to Newton in Optim to bound the step size etc. You can also use Newton with a trust region.

1 Like

Also worth looking into: is Optim.jl trying to hit a higher accuracy threshold that SciPy?

1 Like

Do you mind testing `tron` from JSOSolvers.jl?

``````using NLPModels, JSOSolvers
x -> (x[1] - 1)^2 + 100 * (x[2] - x[1]^2)^2, # objective
[0.1; 0.1],  # x0
zeros(2),    # lvar
ones(2)      # uvar
)
output = tron(nlp)
``````
3 Likes

To this respect, I am very confusing about how to set the accuracy. There are x_tol, f_tol, g_tol, but also *_abstol and *_reltol, also outer_x_tol, outer_f_tol, outer_g_tol and there abs and rel counter parts. Which one controls the final accuracy? The doc seems of little help.

Thanks. I will try it. BTW, atol is for function value or the variables? If atol is for function value, can I control the accuracy of the variables instead?

I have tried it but realized soon that ADNLPModel using ForwardDiff to compute grad but my objective function is an external C++ function. The C++ function also computes the accompanying grad with the objective function value. So, is there an interface to input user-provided grad?

`atol` and `rtol` are fot the first-order condition â€“ projected gradient norm. Our other stopping criteria are `max_eval` and `max_time`, and it could stop if the line search step is too small.

Regarding user-provided functions, you have to a create a manual type, as shown below. Notice that `tron` requires `obj`, `grad!` and `hprod!`, so all three have to be implemented.

``````struct MyProblem <: AbstractNLPModel
meta :: NLPModelMeta
counters :: Counters
end

function MyProblem()
meta = NLPModelMeta(
2, # nvar
x0 = [0.1; 0.1],
lvar=zeros(2),
uvar=ones(2)
)
MyProblem(meta, Counters())
end

function NLPModels.obj(:: MyProblem, x :: AbstractVector)
(x[1] - 1)^2 + 100 * (x[2] - x[1]^2)^2
end

function NLPModels.grad!(:: MyProblem, x :: AbstractVector, g :: AbstractVector)
g[1] = 2 * (x[1] - 1) - 400 * x[1] * (x[2] - x[1]^2)
g[2] = 200 * (x[2] - x[1]^2)
g
end

function NLPModels.hprod!(:: MyProblem, x :: AbstractVector, v :: AbstractVector, Hv :: AbstractVector; obj_weight=1.0)
Hv[1] = obj_weight * (2 - 400 * (x[2] - x[1]^2) + 800 * x[1]^2) * v[1] - 400obj_weight * x[1] * v[2]
Hv[2] = 200obj_weight * v[2] - 400obj_weight * x[1] * v[1]
Hv
end

nlp = MyProblem()

output = tron(nlp)
``````
2 Likes

Thanks for the detailed reply. However, it is infeasible to compute the hessian. I am afraid I cannot try your approach.

We have an alternative to use LBFGS instead of the Hessian, shown below, but notice that the Truncated Newton method is a second-order method, so the following is not a Newton method anymore. The main difference is we remove `hprod` and add `nlp = LBFGSModel(MyProblemNoHess())`.

``````using NLPModels, JSOSolvers

struct MyProblemNoHess <: AbstractNLPModel
meta :: NLPModelMeta
counters :: Counters
end

function MyProblemNoHess()
meta = NLPModelMeta(
2, # nvar
x0 = [0.1; 0.1],
lvar=zeros(2),
uvar=ones(2)
)
MyProblemNoHess(meta, Counters())
end

function NLPModels.obj(:: MyProblemNoHess, x :: AbstractVector)
(x[1] - 1)^2 + 100 * (x[2] - x[1]^2)^2
end

function NLPModels.grad!(:: MyProblemNoHess, x :: AbstractVector, g :: AbstractVector)
g[1] = 2 * (x[1] - 1) - 400 * x[1] * (x[2] - x[1]^2)
g[2] = 200 * (x[2] - x[1]^2)
g
end

nlp = LBFGSModel(MyProblemNoHess())

output = tron(nlp)
``````
3 Likes

Yeah, it works! And it requires less evals than Optim.jl. It would be great if you can address following two minor points:

(1) My external C++ funciton is a very expensive function, and it returns the objective value and gradients in a single call:

``````f, g[1], g[2] = CppFunc(x1, x2, c, model, config)
``````

Can tron take advantage of this special structure of my problem to reduce the function call? Currently, I wirte

``````function NLPModels.obj(:: MyProblemNoHess, x :: AbstractVector)
f, _, _ = CppFunc(x[1], x[2], c, model, config)
return f
end

function NLPModels.grad!(:: MyProblemNoHess, x :: AbstractVector, g :: AbstractVector)
_, g[1], g[2] = CppFunc(x[1], x[2], c, model, config)
return g
end
``````

However, if the input x is the same, the funciton shall be evaluated only once to save computation time.

(2) As you can see in the above demo code, I need some extra parameters `c, model, config` to call the function. I donâ€™t like these variables lives in the global. But when I put the `NLPModels.obj` and `NLPModels.grad!` definitions into a function, Julia will complain: ERROR: LoadError: syntax: Global method definition around /path/to/test.jl:273 needs to be placed at the top level, or use â€śevalâ€ť. Is there an elegant way to define these function so they can take local parameters?

Iâ€™m happy to know itâ€™s working, thanks for using it!

(1) Good point. Iâ€™ve created a pull request enabling that https://github.com/JuliaSmoothOptimizers/JSOSolvers.jl/pull/41
To enable call `tron` with keyword `use_only_objgrad=true`. It will be released on `0.4.3`.

(2) This extra information should probably go inside `MyProblemNoHess`, we do similar things in some of our models.
Untested example:

``````struct MyProblemNoHess <: AbstractNLPModel
meta :: NLPModelMeta
counters :: Counters
c
model
config
end

function MyProblemNoHess()
meta = NLPModelMeta(
2, # nvar
x0 = [0.1; 0.1],
lvar=zeros(2),
uvar=ones(2)
)
# Get/define/create c, model and config
MyProblemNoHess(meta, Counters(), c, model, config)
end

function NLPModels.obj(nlp :: MyProblemNoHess, x :: AbstractVector)
f, _, _ = CppFunc(x[1], x[2], nlp.c, nlp.model, nlp.config)
return f
end

function NLPModels.grad!(nlp :: MyProblemNoHess, x :: AbstractVector, g :: AbstractVector)
_, g[1], g[2] = CppFunc(x[1], x[2], nlp.c, nlp.model, nlp.config)
return g
end
``````

Other examples include `AugLagModel` in Percival.jl, where we keep the Lagrangian multipliers inside the model as well.

2 Likes

Looking forward to the new version. The approach for the (2) point works great. Thanks for your help!

Sad to hear that itâ€™s taking longer than TNC (for which Iâ€™m assuming youâ€™re then using finite difference hessians?). I supposed itâ€™s not possible to have a look at the actual function? Or even just the output of the different algorithms.

The box constrained optimizer in Optim isâ€¦ not optimal. It uses a log-barrier term to enforce the constraints and I agree with the observation that it tends to require more evaluations than â€śnecessaryâ€ť. I intend to add some things here in the future, but Iâ€™m happy you found a solution to your problem with JSO

2 Likes

Optim.jl is also a great package. I use it with Brent() for univariate optimization in the same project.

I also want to further test Optim.jl for multivariate optimization. Can you point to me where can I find the details for setting tolerance correctly, especially for box constrained problem using Fminbox, since there are two sets of tolerance tol and outer_tol. If I wish to obtain a solution (input variable x) within a given accuracy (say 1e-3), how can I set the tolerances? Thanks!

The new version is out, let me know how it goes.

I have tested Optim.jl + BFGS again with `g_tol = 1e-4` and `outer_g_tol = 1e-4` and JSO with `atol = 1e-4` . And it seems they both take the same number of evaluations. My previous test of Optim.jl setting `x_tol = 1e-4` and `outer_x_tol = 1e-4` which requires more evalutions.

I have tried the new version. However, it seems no improvements by using objgrad!. I suspect that there are too many cases during optimization which evaluate objgrad! for obj or grad value only. Following are the traces. Iâ€™ve put a println inside the core function. Each objgrad! has to call F twice to construct both obj and grad values.

``````------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:   iter      f(x)         Ď€         Î”     ratio         cgstatus
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      0   5.2e+00   1.2e-01   1.0e+00   8.7e-01  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      1   5.2e+00   1.4e-02   1.2e-01   1.6e+00  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      2   5.2e+00   8.3e-03   1.2e-01   1.2e+00  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      3   5.2e+00   1.6e-03   1.2e-01   1.2e+00  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      4   5.2e+00   2.7e-04   1.2e-01   1.1e+00  stationary point found
[ Info:      5   5.2e+00   2.3e-05   1.2e-01
"Execution stats: first-order stationary"
``````

``````------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:   iter      f(x)         Ď€         Î”     ratio         cgstatus
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      0   5.2e+00   1.2e-01   1.0e+00   8.7e-01  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      1   5.2e+00   1.4e-02   1.2e-01   1.6e+00  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      2   5.2e+00   8.3e-03   1.2e-01   1.2e+00  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      3   5.2e+00   1.6e-03   1.2e-01   1.2e+00  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      4   5.2e+00   2.7e-04   1.2e-01   1.1e+00  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      5   5.2e+00   2.3e-05   1.2e-01
"Execution stats: first-order stationary"
``````

I forgot to say you should update your code to provide `objgrad!` instead of `obj` and `grad!`:

``````function NLPModels.objgrad!(:: MyProblemNoHess, x :: AbstractVector, g :: AbstractVector)
f, g[1], g[2] = CppFunc(x[1], x[2], c, model, config)
return f, g
end
``````
1 Like

Otherwise `objgrad` defaults to calling `obj` and `grad`.

1 Like