Providing More Initial Parameter Values than there are Parameters in the Function Being Optimized

donkeysaddle · August 26, 2020, 5:53pm

Why does the “extra” parameter value affect the optimization? Apologies if this is in the Optim documentation - I didn’t see it.

For example:

using Optim
f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

x0 = [0.0, 0.0, 2.0]
optimize(f, x0)

gives a different result than

f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

x0 = [0.0, 0.0]
optimize(f, x0)

What is the “2.0” in the initial parameter value array doing?

juliohm · August 26, 2020, 6:27pm

I don’t know what it is doing, but the internals of Optim.jl are probably generic enough that the code doesn’t depend on the exact length of the vector when updating it. If the objective is 2D, I would simply avoid passing a vector of the wrong length to the optimize function.

donkeysaddle · August 26, 2020, 6:33pm

Could it be changing f(x) to this:

f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2 + 0*x[3]

Would be nice to know - I made this mistake with a more complicated problem and the results I got were actually improved…

juliohm · August 26, 2020, 6:36pm

We can play the guessing game, but if you are really interested in this corner case, Optim.jl is open source! Type @edit optimize(f, x0) and read the code.

The issue is probably not in the objective function, but in the vector updates. You can write x -= M*x for example without knowing the exact length of x.
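To illustrate the point about length-agnostic updates, here is a minimal sketch in plain Julia. It is not taken from Optim.jl; the function name update and the choice M = 0.5I are assumptions made only for this example.

```julia
using LinearAlgebra

# A generic update: nothing here refers to a specific length of x.
# `update` is a hypothetical helper, not an Optim.jl function.
update(x, M) = x - M * x

# 0.5I is a scaling operator that adapts to whatever length x has.
println(update([1.0, 2.0], 0.5I))       # works on a length-2 vector
println(update([1.0, 2.0, 3.0], 0.5I))  # works unchanged on a length-3 vector
```

Code written this way runs on a vector of any length, so an over-long x0 does not raise an error by itself.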

Tamas_Papp · August 27, 2020, 8:47am

I can’t replicate this. I get the same minimizer (within numerical error) for the first two coordinates. Of course the third coordinate is arbitrary, since f does not depend on it, but that should not matter.

If you are asking why the results are not exactly equal: they are two different problems (in \mathbb{R}^2 and \mathbb{R}^3) and different steps are taken.
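One way to check this, as a rough sketch: run both starting points and compare only the first two coordinates of each minimizer. This assumes Optim.jl is installed and that optimize(f, x0) uses its default derivative-free method; the names res2, res3, m2 and m3 are chosen only for illustration.

```julia
using Optim

f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

res2 = optimize(f, [0.0, 0.0])        # problem posed in R^2
res3 = optimize(f, [0.0, 0.0, 2.0])   # same f, but posed in R^3

# Only the first two coordinates enter f, so compare those.
m2 = Optim.minimizer(res2)
m3 = Optim.minimizer(res3)[1:2]

println(m2)
println(m3)

# The two runs need not take the same number of iterations.
println(Optim.iterations(res2), " vs ", Optim.iterations(res3))
```

The printed vectors should agree only approximately, since the two runs take different steps.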

yha · August 27, 2020, 11:21am

Why would you think it changes f? The function you originally defined is applicable to vectors of length 3 (or any length ≥2) as is:

f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
f([1,2,3]) # == 100.0

pkofod · August 27, 2020, 12:45pm

Well, I don’t see why it would affect the result. Are you sure you don’t have a sum of x or norm or something like that that would be affected by the third element?

Tamas_Papp · August 27, 2020, 1:31pm

I don’t think it does (in the \approx sense). Did you run the example?

pkofod · August 27, 2020, 1:46pm

Ah, so actually it does in this case, because you’re using Nelder-Mead and the initial simplex is built from the initial x. With a third coordinate the simplex has an extra vertex, so the centroid elements for x[1] and x[2] change, which changes the progression of the algorithm. Edit: but the solution is obviously approximately the same (one has elements just above one and the other just below, and the minimizer is [1,1]).
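A minimal sketch of the simplex effect, in plain Julia without Optim. It assumes an affine initial simplex in which vertex 1 is x0 and each further vertex perturbs one coordinate. The helper names initial_simplex and centroid and the constants a and b are assumptions for illustration, not a statement of what Optim.jl does internally. Real Nelder-Mead also takes the centroid over all vertices except the worst one; the plain mean is used here only to keep the example short.

```julia
# Hypothetical helper: vertex 1 is x0, vertex j+1 perturbs only coordinate j.
# The constants a and b are illustrative assumptions.
function initial_simplex(x0; a = 0.025, b = 0.5)
    simplex = [copy(x0) for _ in 1:length(x0)+1]
    for j in eachindex(x0)
        simplex[j+1][j] = (1 + b) * x0[j] + a
    end
    return simplex
end

# Mean of all vertices, coordinate by coordinate.
centroid(simplex) = sum(simplex) / length(simplex)

c2 = centroid(initial_simplex([0.0, 0.0]))       # 3 vertices in R^2
c3 = centroid(initial_simplex([0.0, 0.0, 2.0]))  # 4 vertices in R^3

println(c2[1:2])
println(c3[1:2])
```

The extra vertex in the 3D case changes the average of the first two coordinates even though f never reads x[3], so the two runs diverge from the first iteration onward.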

donkeysaddle · September 3, 2020, 5:51pm

Thanks for all the responses. This makes sense, much more plausible than my hypothesis.

Thanks again,

DS
