Objective function increases during optimization with Optim.jl

F-YF · October 11, 2021, 9:07am

While running an optimization with the Optim package (Optim.jl), I found that the objective function sometimes unexpectedly goes up. I don’t know why this happens and hope someone can explain it. Thank you very much. The details are as follows.

The optimizer is called like this:

res = optimize(Optim.only_fg!(fg_Function!), 
                            InitialF, LBFGS(), inplace = false,
                            Optim.Options(x_abstol = 0.0, x_reltol = 0.0, f_abstol = 0.0, f_reltol = 0.0,
                            g_abstol = 0.0, g_reltol = 0.0, show_trace = true, iterations = OptIter, store_trace = true))
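
For readers unfamiliar with this interface, here is a minimal sketch of the shape a combined objective-and-gradient function usually takes when it is passed through Optim.only_fg!. The Rosenbrock function and the name rosenbrock_fg! are stand-ins chosen for illustration, not the objective from this post, and the sketch assumes the default in-place (F, G, x) convention described in the Optim.jl documentation.

```julia
# Stand-in objective (Rosenbrock); NOT the objective from the post.
# Assumed convention: F and G may each be `nothing` when the optimizer
# does not need the value or the gradient for a given call.
function rosenbrock_fg!(F, G, x)
    a = 1 - x[1]
    b = x[2] - x[1]^2
    if G !== nothing
        # Fill the gradient in place.
        G[1] = -2a - 400 * x[1] * b
        G[2] = 200b
    end
    if F !== nothing
        # Return the objective value only when it is requested.
        return a^2 + 100 * b^2
    end
    return nothing
end

# Direct call, no optimizer involved: value and gradient at the origin.
G = zeros(2)
value = rosenbrock_fg!(0.0, G, [0.0, 0.0])
println(value, " ", G)
```

With Optim.jl loaded, a function of this shape would typically be passed as optimize(Optim.only_fg!(rosenbrock_fg!), x0, LBFGS()).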

fg_Function! computes the objective function and its gradient, both of which have been tested without problems. The output is as follows:

****** Start iteration ******
F= -4.5092727954675236 
Iter     Function value   Gradient norm 
     0    -4.509273e+00     3.392928e-05
 * time: 5.602836608886719e-5
F= -4.509272777421474
F= -4.5092727929444236
     1    -4.509273e+00     2.752020e-05
 * time: 2270.425837993622
F= -4.509272792529507 
F= -4.5092727938377273 
F= -4.50927279939761 
F= -4.5092727621328796 
F= -4.509272799025066 
     2    -4.509273e+00     7.634763e-05
 * time: 8109.941900014877
F= -4.509272810926075 
F= -4.5092726943995975 
F= -4.509272808015269 
     3    -4.509273e+00     4.701028e-05
 * time: 11601.862957000732
F= -4.5092728164806513 
F= -4.5092728412304868 
F= -4.5092726582391403 
F= -4.5092728448386152 
     4    -4.509273e+00     1.918509e-04
 * time: 16263.132133960724
F= -4.5092728623030904 
F= -4.5092727831721384 
F= -4.50927286394497 
     5    -4.509273e+00     6.076972e-05
 * time: 19799.04483485222
F= -4.509272865883659 
F= -4.509272869318157 
F= -4.509272869016567 
     6    -4.509273e+00     6.941216e-05
 * time: 23349.618561029434
F= -4.50927287416470 
F= -4.5092728917610874 
F= -4.5092728164927656 
F= -4.5092728980259014 
     7    -4.509273e+00     3.006336e-05
 * time: 28027.132161855698
F= -4.5092729002958176 
F= -4.50927288461689 
F= -4.50927289877002 
     8    -4.509273e+00     2.579953e-05
 * time: 31555.314399957657
F= -4.509272899666974 
F= -4.50927290675508 
F= -4.5092729403745038
F= -4.509272894742775
F= -4.5092729591768324 
     9    -4.509273e+00     4.023887e-05
 * time: 37346.814374923706
F= -4.5092729849628834
F= -4.509272952205056
F= -4.509272993048358 
    10    -4.509273e+00     3.574861e-05
 * time: 40652.16582298279
F= -4.5092730050285605
F= -4.509273043435568
F= -4.509273053946075
F= -4.509273080837682 
    11    -4.509273e+00     4.467094e-05
 * time: 44999.02312397957
F= -4.5092732196861642
F= -4.5092735683321523
F= -4.509273523863599 
    12    -4.509274e+00     9.423961e-05
 * time: 48244.51411008835
F= -4.509273791374507
F= -4.5092748461061034
F= -4.5092609228550677
F= -4.5092734668365786
F= -4.509273443446354
F= -4.509273322647026
F= -4.5092732752980393
F= -4.509273251845994
F= -4.5092732247850065
F= -4.509273189111617
F= -4.509273157333895
F= -4.509273130782579
F= -4.509273108386304
F= -4.509273089086743
F= -4.5092730721028644
F= -4.509273057023425
F= -4.5092730433268424
F= -4.509273031069279
F= -4.5092730200095716
F= -4.509273009942818
F= -4.509273000506798
F= -4.509272991747495
F= -4.509272983531698
F= -4.509272975819828
F= -4.5092729686011324
F= -4.5092729616603844
F= -4.5092729552624724
F= -4.50927294920335
F= -4.5092729433565895
F= -4.5092729378717173
F= -4.509272932668717
F= -4.509272927672122
F= -4.5092729228816064
F= -4.5092729182949816
F= -4.509272913904435
F= -4.5092729096995674
F= -4.5092729056604166
F= -4.5092729017945956
F= -4.5092728980402716
F= -4.509272894404665
F= -4.50927289093888
F= -4.5092728875175236
F= -4.5092728842502288
F= -4.5092728810681892
F= -4.509272878017274
F= -4.5092728751101006
F= -4.509272872315126
ERROR: AssertionError: B > A
Stacktrace:
 [1] (::LineSearches.HagerZhang{Float64,Base.RefValue{Bool}})(::Function, ::LineSearches.var"#ϕdϕ#6"{Optim.ManifoldObjective{OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}},Array{Float64,1},Array{Float64,1},Array{Float64,1}}, ::Float64, ::Float64, ::Float64) at /public3/home/sc55305/.julia/packages/LineSearches/Ki4c5/src/hagerzhang.jl:276
 [2] HagerZhang at /public3/home/sc55305/.julia/packages/LineSearches/Ki4c5/src/hagerzhang.jl:101 [inlined]
 [3] perform_linesearch!(::Optim.LBFGSState{Array{Float64,1},Array{Array{Float64,1},1},Array{Array{Float64,1},1},Float64,Array{Float64,1}}, ::LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#18#20"}, ::Optim.ManifoldObjective{OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}}) at /public3/home/sc55305/.julia/packages/Optim/uwNqi/src/utilities/perform_linesearch.jl:59
 [4] update_state!(::OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Optim.LBFGSState{Array{Float64,1},Array{Array{Float64,1},1},Array{Array{Float64,1},1},Float64,Array{Float64,1}}, ::LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#18#20"}) at /public3/home/sc55305/.julia/packages/Optim/uwNqi/src/multivariate/solvers/first_order/l_bfgs.jl:204
 [5] optimize(::OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Array{Float64,1}, ::LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#18#20"}, ::Optim.Options{Float64,Nothing}, ::Optim.LBFGSState{Array{Float64,1},Array{Array{Float64,1},1},Array{Array{Float64,1},1},Float64,Array{Float64,1}}) at /public3/home/sc55305/.julia/packages/Optim/uwNqi/src/multivariate/optimize/optimize.jl:57
 [6] optimize(::OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Array{Float64,1}, ::LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#18#20"}, ::Optim.Options{Float64,Nothing}) at /public3/home/sc55305/.julia/packages/Optim/uwNqi/src/multivariate/optimize/optimize.jl:35
 [7] #optimize#87 at /public3/home/sc55305/.julia/packages/Optim/uwNqi/src/multivariate/optimize/interface.jl:142 [inlined]
 [8] main() at ./none:45
 [9] top-level scope at ./timing.jl:174

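Here the iteration limit is OptIter = 20, and F is the value of the objective function at each evaluation.

You can see that the LBFGS iteration actually produces an increase in the objective function: after iteration 12, F reaches -4.5092748461 and then climbs back to about -4.5092728723 over the following evaluations, until the line search fails with the AssertionError.

My preliminary judgment is that the step size of LBFGS needs to be adjusted, but I don’t know how to adjust the parameters so that the search moves in a descent direction. For reference, the LBFGS constructor and its defaults are: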

LBFGS(; m = 10,
        alphaguess = LineSearches.InitialStatic(),
        linesearch = LineSearches.HagerZhang(),
        P = nothing,
        precondprep = (P, x) -> nothing,
        manifold = Flat(),
        scaleinvH0::Bool = true && (typeof(P) <: Nothing))

Are there any suggestions about how best to tackle the problem?
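
One way to experiment with the step size is to change the alphaguess and linesearch keywords listed in the constructor above. The following is a minimal sketch, not a diagnosis of the problem in this thread: it uses Rosenbrock as a stand-in objective, assumes that both Optim.jl and LineSearches.jl are installed, and the choice of BackTracking with an initial step of 0.5 is purely illustrative.

```julia
using Optim, LineSearches

# Stand-in problem (Rosenbrock); NOT the objective from the post.
rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

function rosenbrock_grad!(G, x)
    G[1] = -2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2)
    G[2] = 200 * (x[2] - x[1]^2)
    return G
end

# Same LBFGS method, but with a smaller initial trial step and a
# backtracking line search instead of the default HagerZhang.
method = LBFGS(alphaguess = LineSearches.InitialStatic(alpha = 0.5),
               linesearch = LineSearches.BackTracking())

x0 = [-1.2, 1.0]
res = optimize(rosenbrock, rosenbrock_grad!, x0, method,
               Optim.Options(iterations = 200))

println(Optim.minimum(res))
```

A backtracking search only accepts a step that satisfies a sufficient-decrease condition, which makes it a simple alternative to try when HagerZhang fails. Whether it helps on a given objective is something to test rather than assume.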

cmarcotte · October 11, 2021, 9:31am

From here:

"With Optim.jl optimizers, you can set allow_f_increases=true in order to let increases in the loss function not cause an automatic halt of the optimization process. Using a method like BFGS or NewtonTrustRegion is not guaranteed to have monotonic convergence and so this can stop early exits which can result in local minima."

So I think this is a feature of (L)BFGS methods. I think you need to make use of the allow_f_increases = false option.

F-YF · October 11, 2021, 1:27pm

Thank you for your suggestion. I’ll think about it.

F-YF · October 20, 2021, 2:51am

I don’t think your suggestion solves my problem. The documentation describes the option like this: "allow_f_increases: Allow steps that increase the objective value. Defaults to false. Note that, when setting this to true, the last iterate will be returned as the minimizer even if the objective increased."

So the new problem is that LBFGS does not return the minimum value it found, only the value from the last iteration. However, this also happens when I set the parameter to false. It looks like I need to open a new question to make that distinction.

goerch · October 20, 2021, 7:45am

So what you are saying is your objective is increasing although allow_f_increases is false? That sounds fishy to me and I’d probably file an issue against the package.

cmarcotte · October 20, 2021, 12:00pm

Yes, perhaps (L)BFGS is ill-suited to your problem? In my experience, the allow_f_increases = true option just allows the method to escape some number of local minima before finding a steady optimum. In concert with your store_trace = true option, you are always able to find the best solution from those explored after the optimization. If a different optimization algorithm is viable for your problem, then most of those in Optim.jl should be drop-in replacements. Alternatively, de rigueur appears to be chaining optimization algorithms (global into local, or fast into slow); I have had success with Nelder-Mead into BFGS for some mid-size problems.
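
The two ideas above (chaining methods, and picking the best iterate from the stored trace) can be sketched as follows. This is an illustration under assumptions, not a recipe for the problem in this thread: Rosenbrock is a stand-in objective, the iteration counts are arbitrary, the LBFGS stage relies on Optim's default finite-difference gradients, and extended_trace = true is added so that the iterates themselves are stored.

```julia
using Optim

# Stand-in problem (Rosenbrock); NOT the objective from the post.
rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

x0 = [-1.2, 1.0]

# Stage 1: a short derivative-free run to get a rough starting point.
stage1 = optimize(rosenbrock, x0, NelderMead(),
                  Optim.Options(iterations = 50))

# Stage 2: refine with LBFGS, allowing increases and recording
# the objective value and the iterate at every iteration.
stage2 = optimize(rosenbrock, Optim.minimizer(stage1), LBFGS(),
                  Optim.Options(allow_f_increases = true,
                                store_trace = true,
                                extended_trace = true,
                                iterations = 200))

# Pick the best recorded iterate rather than trusting the last one.
f_values = Optim.f_trace(stage2)
x_values = Optim.x_trace(stage2)
best_index = argmin(f_values)
best_f, best_x = f_values[best_index], x_values[best_index]
println(best_f, " at ", best_x)
```

Note that the trace holds one entry per outer iteration, so trial points evaluated inside a line search are not part of it.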

F-YF · October 21, 2021, 1:28am

You are right, the objective function goes up even though allow_f_increases is false.

F-YF · October 21, 2021, 2:22am

I don’t think this method works very well. If I can’t make my objective function decrease monotonically with Optim.jl, what’s the point of this optimization process?

F-YF · October 21, 2021, 2:53am

You’re right, sometimes an optimization gets stuck in a local optimum, but I don’t think that’s a problem as long as the objective descends like a staircase. The Optim.jl optimization process has two levels of iteration: the outer iterations controlled by the iterations parameter, and the inner search steps that LBFGS performs within each iteration. store_trace = true only stores the value at the last step of each LBFGS search.

What really confused me was the intermediate search steps of LBFGS, which you can see in the output in my first post. During some of the LBFGS searches a value clearly appears that is smaller than the final one, but the larger value from the last step is still returned.
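
If the goal is to keep the lowest value seen in any evaluation, including trial points inside a search, one option is to record it in the objective function itself. The sketch below is a hypothetical helper (track_best is not part of Optim.jl) that wraps an (F, G, x)-style function, and the quadratic objective and the trial points are stand-ins for illustration.

```julia
# Hypothetical helper: wrap an fg!-style function so that the lowest
# objective value seen in ANY call is remembered, together with the
# point that produced it.
function track_best(fg!)
    best_f = Ref(Inf)
    best_x = Ref(Float64[])
    function wrapped!(F, G, x)
        val = fg!(F, G, x)
        # Only compare when the objective value was actually requested.
        if F !== nothing && val < best_f[]
            best_f[] = val
            best_x[] = copy(x)
        end
        return val
    end
    return wrapped!, best_f, best_x
end

# Stand-in objective: a simple quadratic, NOT the objective from the post.
function quad_fg!(F, G, x)
    if G !== nothing
        G .= 2 .* x
    end
    return F === nothing ? nothing : sum(abs2, x)
end

tracked!, best_f, best_x = track_best(quad_fg!)

# Pretend these are the trial points of one search.
for trial in ([2.0, 0.0], [0.5, 0.5], [1.0, 1.0])
    tracked!(0.0, nothing, trial)
end
println(best_f[], " at ", best_x[])
```

The wrapped function could then be handed to the optimizer in place of the original, and best_f[] and best_x[] inspected afterwards regardless of which iterate the optimizer returns.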

F-YF · October 21, 2021, 2:56am

I think it’s also possible that my function definition contains some unknown error, so I’ll check my code.
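
A common first check in this situation is to compare the analytic gradient against a finite-difference estimate. Below is a minimal sketch of such a check for an (F, G, x)-style function. The helper max_gradient_error is hypothetical, Rosenbrock is a stand-in objective, and the step h = 1e-6 is an assumption that may need tuning when the objective is only accurate to a limited number of digits.

```julia
# Hypothetical helper: largest absolute difference between the analytic
# gradient and a central finite-difference estimate.
function max_gradient_error(fg!, x; h = 1e-6)
    G = zeros(length(x))
    fg!(nothing, G, x)                      # analytic gradient only
    worst = 0.0
    for i in eachindex(x)
        xp = copy(x); xp[i] += h
        xm = copy(x); xm[i] -= h
        fd = (fg!(0.0, nothing, xp) - fg!(0.0, nothing, xm)) / (2h)
        worst = max(worst, abs(fd - G[i]))
    end
    return worst
end

# Stand-in objective (Rosenbrock); NOT the objective from the post.
function toy_fg!(F, G, x)
    a = 1 - x[1]
    b = x[2] - x[1]^2
    if G !== nothing
        G[1] = -2a - 400 * x[1] * b
        G[2] = 200b
    end
    return F === nothing ? nothing : a^2 + 100 * b^2
end

err = max_gradient_error(toy_fg!, [0.3, -0.7])
println("max gradient error: ", err)
```

A large error relative to the gradient norm would point to a mismatch between the objective and its derivative, which is one way a line search can end up moving uphill.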

cvanaret · November 5, 2021, 5:10pm

The point of non-monotonic methods is, among others, to:

have a more flexible way to accept trial steps;
allow full Newton steps (even though you temporarily degrade the objective and the constraint violation), which leads to fast convergence.

It is a bit counterintuitive, but several techniques were devised this way (e.g. https://mathematicsinindustry.springeropen.com/track/pdf/10.1186/s13362-016-0029-1.pdf)
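
To make the idea of a more flexible acceptance rule concrete, here is a minimal sketch of one classic non-monotone scheme: a backtracking search in the style of Grippo, Lampariello and Lucidi, where a trial step is compared against the largest of the last few objective values instead of only the most recent one. This is an illustration chosen by analogy and is not taken from the linked paper or from Optim.jl. The quadratic objective, the steepest-descent direction, and all parameter values are assumptions.

```julia
using LinearAlgebra

# Stand-in problem: an ill-conditioned quadratic, NOT from the thread.
f(x) = 0.5 * (x[1]^2 + 10 * x[2]^2)
grad(x) = [x[1], 10 * x[2]]

# Backtrack until the trial value is below the LARGEST of the recent
# objective values (plus a sufficient-decrease term), not just the last.
function nonmonotone_step(f, x, d, slope, recent_f;
                          c = 1e-4, shrink = 0.5, max_backtracks = 30)
    f_ref = maximum(recent_f)
    alpha = 1.0
    for _ in 1:max_backtracks
        f(x + alpha * d) <= f_ref + c * alpha * slope && return alpha
        alpha *= shrink
    end
    return alpha
end

function nonmonotone_descent(f, grad, x0; memory = 5, iterations = 50)
    x = copy(x0)
    history = [f(x)]
    for _ in 1:iterations
        g = grad(x)
        d = -g                                   # steepest-descent direction
        recent = history[max(1, end - memory + 1):end]
        alpha = nonmonotone_step(f, x, d, dot(g, d), recent)
        x = x + alpha * d
        push!(history, f(x))
    end
    return x, history
end

x_final, history = nonmonotone_descent(f, grad, [1.0, 1.0])
println("f went from ", history[1], " to ", history[end])
```

Individual entries of history are allowed to exceed their predecessor, yet every accepted value stays below the maximum of the recent window, which is what keeps the overall trend downward.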
