Problem passing gradients into Optim

Hi! I am trying to minimize a likelihood function over a large number of parameters. Since the problem is quite large, it would be very useful to pass the analytical gradient and Hessian to the optimizer. However, for some reason it is not working: when I pass the gradient and Hessian through the second function below, the optimization takes forever, as if the parameters were never being updated.
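
For reference, this is the calling convention I am trying to follow for Optim.only_fgh!, as I understand it from the Optim.jl documentation (the function name and the zero fills below are just placeholders, not my actual model):

function fgh_template!(F, G, H, x)
    # shared computations for the objective, gradient, and Hessian go here
    if G !== nothing
        G .= 0.0    # write the gradient at x into G in place
    end
    if H !== nothing
        H .= 0.0    # write the Hessian at x into H in place
    end
    if F !== nothing
        return 0.0  # return the objective value only when it is requested
    end
    return nothing
end

# Newton's method can then reuse the analytical derivatives:
# Optim.optimize(Optim.only_fgh!(fgh_template!), x0, Optim.Newton())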

Also, if anybody has suggestions on how to make the functions faster and more efficient, I would appreciate it a lot.

See the following MWE:

using Optim, NLSolversBase, Random
using LinearAlgebra


###############################################################
# Loglikelihood
###############################################################

const J = 75
const Ι = 10  # this is the capital Greek letter \Iota, not a Latin I
const N = 1000
const Mat1 = ones(Int8,Ι,N)
const Mat2 = ones(Int8,J,N)
const Mat3 =  rand(Ι,J,N) .>0.9
NumberParams = Ι 
const τ = rand(Ι,J)

#NumberParams = 2


function Log_Likelihood(N,Ι,J,τ,Mat1,Mat2,Mat3,params) 
    llike = 0.0
    θ = params[1]
    γ = params[2:end]
    γ = vcat(γ,1.0)

    pvals = zeros(Real,Ι,J,N)

    for n = 1:N, j = 1:J
        denom_index = 0.0 :: Real
        for i = 1:Ι
            denom_index += Mat2[j,n]*Mat1[i,n]*exp(-θ*τ[i,j] + γ[i])
        end
        for i = 1:Ι
            llike += Mat3[i,j,n]*Mat2[j,n]*Mat1[i,n]*(-θ*τ[i,j] + γ[i] - log(denom_index))
        end
    end
    return -llike
end



function fgh!(F,G,H,params)
    p = size(params)[1]
    vec=zeros(p)
    mat=zeros(p,p)
    llike = 0.0
    θ = params[1]
    γ = params[2:end]
    γ = vcat(γ,1.0)

    pvals = zeros(Real,Ι,J,N)

    for n = 1:N, j = 1:J
        denom_index = 0.0 :: Real
        for i = 1:Ι
            denom_index += Mat2[j,n]*Mat1[i,n]*exp(-θ*τ[i,j] + γ[i])
        end
        for i = 1:Ι
            pvals[i,j,n] = Mat2[j,n]*Mat1[i,n]*(-θ*τ[i,j] + γ[i] - log(denom_index))
            llike += Mat3[i,j,n]*pvals[i,j,n]
            xvec = vcat(τ[i,j], zeros(Ι))
            xvec[i+1] = 1
            mat .+= (Mat2[j,n]*Mat1[i,n]*exp(pvals[i,j,n])*xvec*xvec')[1:end-1,1:end-1]
        end
    end


    if G !== nothing
        Difference = Mat3 - exp.(pvals)

        for n = 1:N, j = 1:J
            for i = 1:Ι
                vec[1] += Mat2[j,n]*Mat1[i,n]*Difference[i,j,n]*τ[i,j]
            end
            vec[2:end] += (Mat2[j,n]*(Mat1[:,n].*Difference[:,j,n])'*I)[1:end-1]
        end

        G .= vec
    end

    if H !== nothing
        H .= mat
    end

    if F !== nothing
        return -llike
    end
    return nothing
end


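# Test call: fill Gtest and Htest once to check that fgh! runs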
Gtest = zeros(NumberParams)
Htest = zeros(NumberParams,NumberParams)

fgh!(0,Gtest,Htest,ones(NumberParams))


  

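# Version 1: gradient and Hessian generated automatically via ForwardDiff (autodiff = :forward)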
func = TwiceDifferentiable(vars -> Log_Likelihood(N,Ι,J,τ,Mat1,Mat2,Mat3,vars[1:NumberParams]),
                           ones(NumberParams); autodiff=:forward);



opt1 = @time Optim.optimize(func, ones(NumberParams),show_trace = true)

parameters =  Optim.minimizer(opt1)


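# Version 2: analytical gradient and Hessian supplied through fgh! and Optim.only_fgh!, with Newton's method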
opt_explicit = @time Optim.optimize(Optim.only_fgh!(fgh!), ones(NumberParams), Optim.Newton())
parameters_explicit =  Optim.minimizer(opt_explicit)


Thanks so much!

Is this a duplicate of Improving Performance of a Loop?

No, it is different. Setting aside the performance issues we already discussed, for some reason the optimization algorithm is not updating the gradients.
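
For debugging, one check I can run is to compare the hand-coded gradient from fgh! against an automatic-differentiation gradient of the same likelihood, roughly like this (ForwardDiff is not part of the MWE above, it is only a check):

using ForwardDiff

# Fill G_hand with the analytical gradient from fgh! (F and H not requested)
G_hand = zeros(NumberParams)
fgh!(nothing, G_hand, nothing, ones(NumberParams))

# Gradient of the same objective computed by ForwardDiff
G_ad = ForwardDiff.gradient(v -> Log_Likelihood(N, Ι, J, τ, Mat1, Mat2, Mat3, v),
                            ones(NumberParams))

# Should be close to zero if fgh! fills G with the gradient of the returned objective
maximum(abs.(G_hand .- G_ad))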

It might be easier to work with a corrected MWE then?

Hi, I streamlined the MWE a little bit and incorporated the tips you gave me in this new post: Improving performance of a number of functions for large scale gradient based optimization.
