# Problem passing gradients into Optim

Hi! I am trying to minimize a likelihood function over a large number of parameters. Since the problem is quite large, it would be very useful to pass the analytical gradient and Hessian to the optimizer. However, for some reason it is not working: when I pass the gradient and the Hessian in the second call below, the optimization takes forever, as if the parameters were not being updated.
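
For context, this is the `only_fgh!` calling convention I am trying to follow, sketched on a toy quadratic (the function `toy_fgh!` is just an illustration, not my actual problem):

```julia
using Optim

# Sketch of the fgh! convention as I understand it: mutate G and H in
# place when they are not `nothing`, and return the objective value only
# when F is not `nothing`.
function toy_fgh!(F, G, H, x)
    if G !== nothing
        G .= x                  # ∇f(x) = x for f(x) = ½‖x‖²
    end
    if H !== nothing
        H .= 0.0
        for i in eachindex(x)
            H[i, i] = 1.0       # ∇²f(x) = I
        end
    end
    F === nothing || return sum(abs2, x) / 2
    return nothing
end

Optim.optimize(Optim.only_fgh!(toy_fgh!), ones(3), Optim.Newton())
```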

Also, if anybody has suggestions on how to make the functions faster and more efficient, I would appreciate it a lot.

See the following MWE:

```julia
using Optim, NLSolversBase, Random
using LinearAlgebra

###############################################################
# Loglikelihood
###############################################################

const J = 75
const Ι = 10   # Greek capital iota (\Iota), not the Latin letter I
const N = 1000
const Mat1 = ones(Int8, Ι, N)
const Mat2 = ones(Int8, J, N)
const Mat3 = rand(Ι, J, N) .> 0.9
NumberParams = Ι
const τ = rand(Ι, J)


function Log_Likelihood(N, Ι, J, τ, Mat1, Mat2, Mat3, params)
    llike = 0.0
    θ = params[1]                 # scalar coefficient on τ
    γ = vcat(params[2:end], 1.0)  # last γ normalized to 1

    for n = 1:N, j = 1:J
        denom_index = 0.0
        for i = 1:Ι
            denom_index += Mat2[j,n] * Mat1[i,n] * exp(-θ*τ[i,j] + γ[i])
        end
        for i = 1:Ι
            llike += Mat3[i,j,n] * Mat2[j,n] * Mat1[i,n] * (-θ*τ[i,j] + γ[i] - log(denom_index))
        end
    end
    return -llike
end

function fgh!(F, G, H, params)
    p = length(params)
    grad = zeros(p)
    mat = zeros(p, p)
    llike = 0.0
    θ = params[1]                 # scalar coefficient on τ
    γ = vcat(params[2:end], 1.0)  # last γ normalized to 1

    pvals = zeros(Ι, J, N)

    for n = 1:N, j = 1:J
        denom_index = 0.0
        for i = 1:Ι
            denom_index += Mat2[j,n] * Mat1[i,n] * exp(-θ*τ[i,j] + γ[i])
        end
        for i = 1:Ι
            pvals[i,j,n] = Mat2[j,n] * Mat1[i,n] * (-θ*τ[i,j] + γ[i] - log(denom_index))
            llike += Mat3[i,j,n] * pvals[i,j,n]
            xvec = vcat(τ[i,j], zeros(Ι))
            xvec[i+1] = 1
            mat .+= (Mat2[j,n] * Mat1[i,n] * exp(pvals[i,j,n]) * xvec * xvec')[1:end-1, 1:end-1]
        end
    end

    if G !== nothing
        Difference = Mat3 .- exp.(pvals)
        for n = 1:N, j = 1:J
            for i = 1:Ι
                grad[1] += Mat2[j,n] * Mat1[i,n] * Difference[i,j,n] * τ[i,j]  # θ component
            end
            grad[2:end] += (Mat2[j,n] * (Mat1[:,n] .* Difference[:,j,n])' * I)[1:end-1]
        end
        G .= grad
    end

    if H !== nothing
        H .= mat
    end

    F === nothing || return -llike
    return nothing
end

Gtest = zeros(NumberParams)
Htest = zeros(NumberParams, NumberParams)

fgh!(0, Gtest, Htest, ones(NumberParams))

func = TwiceDifferentiable(vars -> Log_Likelihood(N, Ι, J, τ, Mat1, Mat2, Mat3, vars),
                           ones(NumberParams); autodiff = :forward);

# trace options go through Optim.Options; Newton() is the same algorithm
# the explicit-derivative run below uses
opt1 = @time Optim.optimize(func, ones(NumberParams), Optim.Newton(),
                            Optim.Options(show_trace = true))
parameters = Optim.minimizer(opt1)

opt_explicit = @time Optim.optimize(Optim.only_fgh!(fgh!), ones(NumberParams), Optim.Newton())
parameters_explicit = Optim.minimizer(opt_explicit)

```
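
In case it helps, one sanity check I can think of (a sketch; `f`, `x0`, `G_fd`, and the step size `h` are just illustrative names) is to compare the analytic gradient from `fgh!` with central finite differences of `Log_Likelihood`:

```julia
# Sketch: compare the analytic gradient from fgh! with a central
# finite-difference gradient of Log_Likelihood at a trial point.
x0 = ones(NumberParams)
G_analytic = zeros(NumberParams)
fgh!(nothing, G_analytic, nothing, x0)

f(x) = Log_Likelihood(N, Ι, J, τ, Mat1, Mat2, Mat3, x)
h = 1e-6
G_fd = similar(x0)
for k in eachindex(x0)
    e = zeros(NumberParams)
    e[k] = h
    G_fd[k] = (f(x0 + e) - f(x0 - e)) / (2h)
end
maximum(abs.(G_analytic .- G_fd))  # should be small if fgh! is consistent
```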

Thanks so much

Is this a duplicate of Improving Performance of a Loop?

No, it is different. Ignore the performance issues we already discussed; for some reason the optimization algorithm is not updating the gradients.
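
To make the symptom concrete, a wrapper like this (just a hypothetical counting shim around the `fgh!` above) should show whether Optim ever requests the gradient at all:

```julia
# Sketch: count how often Optim requests the objective, gradient, and
# Hessian; if calls[:G] stays at zero, the gradient path is never used.
const calls = Dict(:F => 0, :G => 0, :H => 0)
function counting_fgh!(F, G, H, params)
    F === nothing || (calls[:F] += 1)
    G === nothing || (calls[:G] += 1)
    H === nothing || (calls[:H] += 1)
    return fgh!(F, G, H, params)
end

Optim.optimize(Optim.only_fgh!(counting_fgh!), ones(NumberParams),
               Optim.Newton(), Optim.Options(iterations = 5))
@show calls
```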

It might be easier to work with a corrected MWE then?

Hi, I streamlined the MWE a little and incorporated the tips you gave me in a new post: Improving performance of a number of functions for large scale gradient based optimization.
