Is there a Julia equivalent of scipy.optimize.minimize(method='TNC')?

liuyxpp · August 18, 2020, 12:35pm

which is a truncated Newton (TNC) algorithm, see here for details: minimize(method=’TNC’) — SciPy v1.9.1 Manual.

I have a box constrained optimization problem which requires more evaluations with Optim.jl + BFGS (or any other algorithms) than this Python TNC algorithm. Does anyone know if there is a Julia implementation of this algorithm or a binding (similar to the Python one which is just a binding of a C implementation)?

baggepinnen · August 18, 2020, 12:54pm

What does truncated Newton do? You can modify the line search algorithm to Newton in Optim to bound the step size etc. You can also use Newton with a trust region.

johnmyleswhite · August 18, 2020, 1:08pm

Also worth looking into: is Optim.jl trying to hit a higher accuracy threshold that SciPy?

abelsiqueira · August 18, 2020, 1:12pm

Do you mind testing tron from JSOSolvers.jl?

using NLPModels, JSOSolvers
nlp = ADNLPModel(
    x -> (x[1] - 1)^2 + 100 * (x[2] - x[1]^2)^2, # objective
    [0.1; 0.1],  # x0
    zeros(2),    # lvar
    ones(2)      # uvar
)
output = tron(nlp)

liuyxpp · August 18, 2020, 1:15pm

To this respect, I am very confusing about how to set the accuracy. There are x_tol, f_tol, g_tol, but also *_abstol and *_reltol, also outer_x_tol, outer_f_tol, outer_g_tol and there abs and rel counter parts. Which one controls the final accuracy? The doc seems of little help.

liuyxpp · August 18, 2020, 1:19pm

Thanks. I will try it. BTW, atol is for function value or the variables? If atol is for function value, can I control the accuracy of the variables instead?

liuyxpp · August 18, 2020, 1:50pm

I have tried it but realized soon that ADNLPModel using ForwardDiff to compute grad but my objective function is an external C++ function. The C++ function also computes the accompanying grad with the objective function value. So, is there an interface to input user-provided grad?

abelsiqueira · August 18, 2020, 2:22pm

atol and rtol are fot the first-order condition – projected gradient norm. Our other stopping criteria are max_eval and max_time, and it could stop if the line search step is too small.

Regarding user-provided functions, you have to a create a manual type, as shown below. Notice that tron requires obj, grad! and hprod!, so all three have to be implemented.

struct MyProblem <: AbstractNLPModel
  meta :: NLPModelMeta
  counters :: Counters
end

function MyProblem()
  meta = NLPModelMeta(
    2, # nvar
    x0 = [0.1; 0.1],
    lvar=zeros(2),
    uvar=ones(2)
  )
  MyProblem(meta, Counters())
end

function NLPModels.obj(:: MyProblem, x :: AbstractVector)
  (x[1] - 1)^2 + 100 * (x[2] - x[1]^2)^2
end

function NLPModels.grad!(:: MyProblem, x :: AbstractVector, g :: AbstractVector)
  g[1] = 2 * (x[1] - 1) - 400 * x[1] * (x[2] - x[1]^2)
  g[2] = 200 * (x[2] - x[1]^2)
  g
end

function NLPModels.hprod!(:: MyProblem, x :: AbstractVector, v :: AbstractVector, Hv :: AbstractVector; obj_weight=1.0)
  Hv[1] = obj_weight * (2 - 400 * (x[2] - x[1]^2) + 800 * x[1]^2) * v[1] - 400obj_weight * x[1] * v[2]
  Hv[2] = 200obj_weight * v[2] - 400obj_weight * x[1] * v[1]
  Hv
end

nlp = MyProblem()

output = tron(nlp)

liuyxpp · August 18, 2020, 3:06pm

Thanks for the detailed reply. However, it is infeasible to compute the hessian. I am afraid I cannot try your approach.

abelsiqueira · August 18, 2020, 3:46pm

We have an alternative to use LBFGS instead of the Hessian, shown below, but notice that the Truncated Newton method is a second-order method, so the following is not a Newton method anymore. The main difference is we remove hprod and add nlp = LBFGSModel(MyProblemNoHess()).

using NLPModels, JSOSolvers

struct MyProblemNoHess <: AbstractNLPModel
  meta :: NLPModelMeta
  counters :: Counters
end

function MyProblemNoHess()
  meta = NLPModelMeta(
    2, # nvar
    x0 = [0.1; 0.1],
    lvar=zeros(2),
    uvar=ones(2)
  )
  MyProblemNoHess(meta, Counters())
end

function NLPModels.obj(:: MyProblemNoHess, x :: AbstractVector)
  (x[1] - 1)^2 + 100 * (x[2] - x[1]^2)^2
end

function NLPModels.grad!(:: MyProblemNoHess, x :: AbstractVector, g :: AbstractVector)
  g[1] = 2 * (x[1] - 1) - 400 * x[1] * (x[2] - x[1]^2)
  g[2] = 200 * (x[2] - x[1]^2)
  g
end

nlp = LBFGSModel(MyProblemNoHess())

output = tron(nlp)

liuyxpp · August 19, 2020, 12:40am

Yeah, it works! And it requires less evals than Optim.jl. It would be great if you can address following two minor points:

(1) My external C++ funciton is a very expensive function, and it returns the objective value and gradients in a single call:

f, g[1], g[2] = CppFunc(x1, x2, c, model, config)

Can tron take advantage of this special structure of my problem to reduce the function call? Currently, I wirte

function NLPModels.obj(:: MyProblemNoHess, x :: AbstractVector)
  f, _, _ = CppFunc(x[1], x[2], c, model, config)
  return f
end

function NLPModels.grad!(:: MyProblemNoHess, x :: AbstractVector, g :: AbstractVector)
  _, g[1], g[2] = CppFunc(x[1], x[2], c, model, config)
  return g
end

However, if the input x is the same, the funciton shall be evaluated only once to save computation time.

(2) As you can see in the above demo code, I need some extra parameters c, model, config to call the function. I don’t like these variables lives in the global. But when I put the NLPModels.obj and NLPModels.grad! definitions into a function, Julia will complain: ERROR: LoadError: syntax: Global method definition around /path/to/test.jl:273 needs to be placed at the top level, or use “eval”. Is there an elegant way to define these function so they can take local parameters?

abelsiqueira · August 19, 2020, 1:29am

I’m happy to know it’s working, thanks for using it!

(1) Good point. I’ve created a pull request enabling that https://github.com/JuliaSmoothOptimizers/JSOSolvers.jl/pull/41
To enable call tron with keyword use_only_objgrad=true. It will be released on 0.4.3.

(2) This extra information should probably go inside MyProblemNoHess, we do similar things in some of our models.
Untested example:

struct MyProblemNoHess <: AbstractNLPModel
  meta :: NLPModelMeta
  counters :: Counters
  c
  model
  config
end

function MyProblemNoHess()
  meta = NLPModelMeta(
    2, # nvar
    x0 = [0.1; 0.1],
    lvar=zeros(2),
    uvar=ones(2)
  )
  # Get/define/create c, model and config
  MyProblemNoHess(meta, Counters(), c, model, config)
end

function NLPModels.obj(nlp :: MyProblemNoHess, x :: AbstractVector)
  f, _, _ = CppFunc(x[1], x[2], nlp.c, nlp.model, nlp.config)
  return f
end

function NLPModels.grad!(nlp :: MyProblemNoHess, x :: AbstractVector, g :: AbstractVector)
  _, g[1], g[2] = CppFunc(x[1], x[2], nlp.c, nlp.model, nlp.config)
  return g
end

Other examples include AugLagModel in Percival.jl, where we keep the Lagrangian multipliers inside the model as well.

liuyxpp · August 19, 2020, 3:43am

Looking forward to the new version. The approach for the (2) point works great. Thanks for your help!

pkofod · August 19, 2020, 5:37am

Sad to hear that it’s taking longer than TNC (for which I’m assuming you’re then using finite difference hessians?). I supposed it’s not possible to have a look at the actual function? Or even just the output of the different algorithms.

The box constrained optimizer in Optim is… not optimal. It uses a log-barrier term to enforce the constraints and I agree with the observation that it tends to require more evaluations than “necessary”. I intend to add some things here in the future, but I’m happy you found a solution to your problem with JSO

liuyxpp · August 19, 2020, 5:45am

Optim.jl is also a great package. I use it with Brent() for univariate optimization in the same project.

I also want to further test Optim.jl for multivariate optimization. Can you point to me where can I find the details for setting tolerance correctly, especially for box constrained problem using Fminbox, since there are two sets of tolerance tol and outer_tol. If I wish to obtain a solution (input variable x) within a given accuracy (say 1e-3), how can I set the tolerances? Thanks!

abelsiqueira · August 19, 2020, 12:06pm

The new version is out, let me know how it goes.

liuyxpp · August 20, 2020, 1:17pm

I have tested Optim.jl + BFGS again with g_tol = 1e-4 and outer_g_tol = 1e-4 and JSO with atol = 1e-4 . And it seems they both take the same number of evaluations. My previous test of Optim.jl setting x_tol = 1e-4 and outer_x_tol = 1e-4 which requires more evalutions.

liuyxpp · August 20, 2020, 1:48pm

I have tried the new version. However, it seems no improvements by using objgrad!. I suspect that there are too many cases during optimization which evaluate objgrad! for obj or grad value only. Following are the traces. I’ve put a println inside the core function. Each objgrad! has to call F twice to construct both obj and grad values.

result = tron(nlp, use_only_objgrad=true, atol=1e-4)

------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:   iter      f(x)         π         Δ     ratio         cgstatus  
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      0   5.2e+00   1.2e-01   1.0e+00   8.7e-01  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      1   5.2e+00   1.4e-02   1.2e-01   1.6e+00  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      2   5.2e+00   8.3e-03   1.2e-01   1.2e+00  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      3   5.2e+00   1.6e-03   1.2e-01   1.2e+00  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      4   5.2e+00   2.7e-04   1.2e-01   1.1e+00  stationary point found
[ Info:      5   5.2e+00   2.3e-05   1.2e-01
"Execution stats: first-order stationary"

result = tron(nlp, use_only_objgrad=false, atol=1e-4)

------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:   iter      f(x)         π         Δ     ratio         cgstatus  
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      0   5.2e+00   1.2e-01   1.0e+00   8.7e-01  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      1   5.2e+00   1.4e-02   1.2e-01   1.6e+00  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      2   5.2e+00   8.3e-03   1.2e-01   1.2e+00  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      3   5.2e+00   1.6e-03   1.2e-01   1.2e+00  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      4   5.2e+00   2.7e-04   1.2e-01   1.1e+00  stationary point found
------------ F IS EVALUATED ONCE ------------
------------ F IS EVALUATED ONCE ------------
[ Info:      5   5.2e+00   2.3e-05   1.2e-01
"Execution stats: first-order stationary"

abelsiqueira · August 20, 2020, 2:03pm

I forgot to say you should update your code to provide objgrad! instead of obj and grad!:

function NLPModels.objgrad!(:: MyProblemNoHess, x :: AbstractVector, g :: AbstractVector)
  f, g[1], g[2] = CppFunc(x[1], x[2], c, model, config)
  return f, g
end

abelsiqueira · August 20, 2020, 2:04pm

Otherwise objgrad defaults to calling obj and grad.

Topic		Replies	Views
Trust-region constrained optimization Optimization (Mathematical)	32	1503	January 21, 2023
Optim.jl vs scipy.optimize once again Performance optim	54	8928	June 10, 2021
Making LBFGS in `Optim.jl` as fast as in `NLopt.jl` Optimization (Mathematical)	13	2093	December 6, 2022
Constraints tolerance IPNewton in Optim.jl Optimization (Mathematical) optim	0	271	December 23, 2022
Optim.jl --- Do all Methods Allow Box Constraints? Should all Work Without Them? Optimization (Mathematical)	35	7155	October 3, 2020

Is there a Julia equivalent of scipy.optimize.minimize(method='TNC')?

Related topics