# Conjugate Gradient optimization with set of functions

This post is related to a previous post of mine, but I preferred to present the problem separately. I have to optimize a set of functions with conjugate gradient optimization, but I have to define the whole set of functions first. The problem is that I have a lot of functions to define, and the number of arguments is variable. To keep it simple, I will take the example of a maximum of three arguments: x_1, x_2 and x_3. The coefficients of the functions are given by a matrix X with the same number of columns as the number of existing arguments, say

``````
X = rand(5, 3)
``````

With the three arguments, I can create 7 functions: f(x_1), f(x_2), f(x_3), f(x_1,x_2), f(x_1,x_3), f(x_2,x_3) and f(x_1,x_2,x_3). The coefficients of each function are obtained by selecting the corresponding submatrix of X, with each column corresponding to one variable. For example, for the function f(x_2,x_3), I take

``````
Xsub = X[:, 2:3]
``````
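The seven column subsets can be enumerated programmatically instead of by hand; a sketch, assuming the Combinatorics.jl package is available (it is not part of the original code):

``````
using Combinatorics  # assumed dependency, not in the original post

X = rand(5, 3)
for cols in combinations(1:size(X, 2))  # all nonempty subsets: [1], [2], ..., [1, 2, 3]
    Xsub = X[:, cols]  # coefficients for the function of those variables
    # build/optimize the function for this subset here
end
``````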

and I can construct the function to optimize:

``````
using Optim

# Objective for the pair (x2, x3), using the columns selected in Xsub
f(x2, x3) = sum([3 * transpose(Xsub[i, :]) * [x2^2, x3^2] for i in 1:size(X, 1)])

# In-place gradient of f
function g!(storage, x2, x3)
    storage[1] = sum([6 * Xsub[i, 1] * x2 for i in 1:size(X, 1)])
    storage[2] = sum([6 * Xsub[i, 2] * x3 for i in 1:size(X, 1)])
end
``````

Is it possible to write a single piece of code that creates every possible subset of variables and builds all the corresponding functions exactly as above to be optimized, rather than defining all 7 functions separately?

First, a quick note that you can write that more succinctly, and probably quicker to compute, as:

``````
f(x2, x3) = 3sum(Xsub * [x2^2, x3^2])
``````

But given that, a general strategy might be to parameterize your functions so that one Julia function encapsulates them all (assuming this can actually be done in your case; to know, we'd probably need more examples of what these functions are). E.g. something like:

``````
f(x1, x2, x3, a, c1, p1, c2, p2, c3, p3) = a * sum(X * [c1*x1^p1, c2*x2^p2, c3*x3^p3])
``````

and then optimization involves looping over different values of the parameters of the function rather than over different functions. For the cases where you don't "have" a certain argument, you would just set its coefficient to zero; having a zero gradient in a given direction doesn't trip up `ConjugateGradient()`.


Ah, thank you for your quick note @marius311! The general strategy seems good! I will try it, because I had started coding the function with all variables! So I can set the corresponding parameters to zero when I want a subfunction, right? And what about the equations for the gradient, since the number of equations varies with the number of variables? Do I also set some parameters so that `storage[j] = 0` for the variables that don't appear? And how can I set the parameters to zero in the command `optimize(f, g!, ones(2), ConjugateGradient())`? With `map(x -> optimize(f(x), g!(x), ...), ...)`?

If your functions are as simple as the example here, I wouldn't even hand-code the gradient; I'd bet Julia's automatic differentiation libraries are going to do really well on them. I forget exactly the interface, but Optim may even do it automatically for you.

I was imagining something along the lines of:

``````
for (a, c1, p1, c2, p2, c3, p3) in possible_configurations
    # build/optimize the function for this configuration
end
``````

and then `possible_configurations` is a list of things you want, some of which have the coefficients set to zero.
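One way to build that list is a comprehension over every on/off pattern of the coefficients (a sketch; the fixed values of `a` and the exponents are illustrative, not from the thread):

``````
# Illustrative: fix a and the exponents, vary which coefficients are "on"
a, p1, p2, p3 = 1.0, 2, 2, 2
possible_configurations = [(a, c1, p1, c2, p2, c3, p3)
                           for c1 in (0.0, 1.0), c2 in (0.0, 1.0), c3 in (0.0, 1.0)
                           if c1 + c2 + c3 > 0]  # drop the all-zero case, leaving 7 configurations
``````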


I will explore your loop proposal, thanks! It seems a nice idea! For the optimization, I must provide the gradient when I use `ConjugateGradient()`: it has fewer evaluations to do and converges quicker than a gradient-free optimizer.

Ah wait! I just realized something: in the help page of `Optim.jl` there is an example with `ConjugateGradient()`, the 2D Rosenbrock function. In the example, they provide the gradient function. I tried to run it without the gradient function and it worked. But when we do not provide the gradient function, the number of gradient calls is higher. When we provide the gradient function with `g!`, it lowers the number of gradient calls.

I'm too sad… I tried to apply your suggestions to one of my functions and it doesn't work:

``````
X = rand(50, 6)

# Build the coefficient vector once; each c_j switches variable j on (1) or off (0)
function h(beta0, beta1, beta2, beta3, beta4, beta5, c1, c2, c3, c4, c5)
    betas = [beta0, c1*beta1, c2*beta2, c3*beta3, c4*beta4, c5*beta5]
    -sum(X * betas) + sum(log.(1 .+ exp.(X * betas))) +
        (sum([c1, c2, c3, c4, c5]) + 1) / 2 * log(2 * pi) +
        sum(betas .^ 2) / 2
end
``````

To test the function, I tried:

``````
optimize((beta0, beta1, beta2, beta3, beta4, beta5) -> h(beta0, beta1, beta2, beta3, beta4, beta5, 1.0, 1.0, 1.0, 1.0, 1.0), ones(6), ConjugateGradient())
``````

but I get the following error:

``````
ERROR: MethodError: no method matching (::getfield(Main, Symbol("##57#58")))(::Array{Float64,1})
Closest candidates are:
#57(::Any, ::Any, ::Any, ::Any, ::Any, ::Any) at none:1
Stacktrace:
[3] (::getfield(NLSolversBase, Symbol("#g!#15")){getfield(Main, Symbol("##57#58")),DiffEqDiffTools.GradientCache{Nothing,Nothing,Nothing,Val{:central},Float64,Val{true}}})(::Array{Float64,1}, ::Array{Float64,1}) at C:\Users\User\.julia\packages\NLSolversBase\NsXIC\src\objective_types\oncedifferentiable.jl:56
[4] (::getfield(NLSolversBase, Symbol("#fg!#16")){getfield(Main, Symbol("##57#58"))})(::Array{Float64,1}, ::Array{Float64,1}) at C:\Users\User\.julia\packages\NLSolversBase\NsXIC\src\objective_types\oncedifferentiable.jl:60
[6] initial_state(::ConjugateGradient{Float64,Nothing,getfield(Optim, Symbol("##30#32")),LineSearches.InitialHagerZhang{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}}}, ::Optim.Options{Float64,Nothing}, ::OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Array{Float64,1}) at C:\Users\User\.julia\packages\Optim\L5T76\src\multivariate\solvers\first_order\cg.jl:113
[7] optimize at C:\Users\User\.julia\packages\Optim\L5T76\src\multivariate\optimize\optimize.jl:33 [inlined]
[8] #optimize#93(::Bool, ::Symbol, ::Function, ::Function, ::Array{Float64,1}, ::ConjugateGradient{Float64,Nothing,getfield(Optim, Symbol("##30#32")),LineSearches.InitialHagerZhang{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}}}, ::Optim.Options{Float64,Nothing}) at C:\Users\User\.julia\packages\Optim\L5T76\src\multivariate\optimize\interface.jl:116
[9] optimize(::Function, ::Array{Float64,1}, ::ConjugateGradient{Float64,Nothing,getfield(Optim, Symbol("##30#32")),LineSearches.InitialHagerZhang{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}}}, ::Optim.Options{Float64,Nothing}) at C:\Users\User\.julia\packages\Optim\L5T76\src\multivariate\optimize\interface.jl:115 (repeats 2 times)
[10] top-level scope at none:0
``````

I think it perhaps comes from an Integer or Float problem, but I don't see where… When I write
`h(2,3,2,2,2,3,1,1,1,1,1)` for example, it works and returns a value.

Otherwise, how should I configure `possible_configurations` in the `for` loop for Julia to take `(c1,c2,c3,c4,c5)` as a vector of values at each iteration? With `zip`?

The issue is just that Optim passes you an array rather than individual parameters, so you could just do something like:

``````
f(betas) = h(betas..., 1, 1, 1, 1, 1)

optimize(
    f,
    ones(6),
    ConjugateGradient();
    autodiff = :forward,
    x_tol = 1e-7,
)
``````

I looked up the Optim automatic differentiation API and it's as easy as adding that `autodiff` parameter, so no need to hand-code gradients.
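Putting the loop idea and this closure pattern together, one hedged sketch (the configuration list shown is purely illustrative, and passing `autodiff = :forward` alongside `ConjugateGradient()` is assumed to work as above):

``````
using Optim

X = rand(50, 6)

# Same objective as above, written once over a coefficient vector;
# `cs` switches each of the five non-intercept variables on or off
function h(betas, cs)
    b = vcat(betas[1], cs .* betas[2:6])
    -sum(X * b) + sum(log.(1 .+ exp.(X * b))) +
        (sum(cs) + 1) / 2 * log(2 * pi) + sum(b .^ 2) / 2
end

# Illustrative configurations: all variables on, then only the first two
possible_configurations = [ones(5), [1.0, 1.0, 0.0, 0.0, 0.0]]

results = [optimize(betas -> h(betas, cs), ones(6), ConjugateGradient();
                    autodiff = :forward)
           for cs in possible_configurations]
``````

Each entry of `results` then holds the optimum for one subset of variables.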


Thanks @marius311, it worked perfectly! Now I just have to code the `for` loop as you suggested. I'll keep you posted!

I was able to implement your loop suggestion with this new way of passing the function to Optim!! Thanks a lot, @marius311!!!
