Problem passing gradients into Optim

Hi! I am trying to minimize a likelihood function over a large number of parameters. Since the problem is quite large, it would be very useful to pass the analytical gradient and Hessian to the optimizer. However, for some reason it is not working: when I pass the gradient and Hessian through the second function below, the optimization takes forever, as if the parameters were never being updated.
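
For reference, this is the calling convention I am trying to follow for Optim.only_fgh!, as I understand it from the Optim.jl documentation (the function name and the zero fills below are just placeholders, not my actual model):

function fgh_template!(F, G, H, x)
    # shared computations for the objective, gradient, and Hessian go here
    if G !== nothing
        G .= 0.0    # write the gradient at x into G in place
    end
    if H !== nothing
        H .= 0.0    # write the Hessian at x into H in place
    end
    if F !== nothing
        return 0.0  # return the objective value only when it is requested
    end
    return nothing
end

# Newton's method can then reuse the analytical derivatives:
# Optim.optimize(Optim.only_fgh!(fgh_template!), x0, Optim.Newton())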

Also, if anybody has suggestions on how to make the functions faster and more efficient, I would appreciate it a lot.

See the following MWE:

using Optim, NLSolversBase, Random
using LinearAlgebra


###############################################################
# Loglikelihood
###############################################################

const J = 75
const Ι = 10  # this is the capital Greek letter \Iota, not a Latin I
const N = 1000
const Mat1 = ones(Int8,Ι,N)
const Mat2 = ones(Int8,J,N)
const Mat3 =  rand(Ι,J,N) .>0.9
NumberParams = Ι 
const τ = rand(Ι,J)

#NumberParams = 2


function Log_Likelihood(N,Ι,J,τ,Mat1,Mat2,Mat3,params) 
    llike = 0.0
    θ = params[1]
    γ = params[2:end]
    γ = vcat(γ,1.0)

    pvals = zeros(Real,Ι,J,N)

    for n = 1:N, j = 1:J
        denom_index = 0.0 :: Real
        for i = 1:Ι
            denom_index += Mat2[j,n]*Mat1[i,n]*exp(-θ*τ[i,j] + γ[i])
        end
        for i = 1:Ι
            llike += Mat3[i,j,n]*Mat2[j,n]*Mat1[i,n]*(-θ*τ[i,j] + γ[i] - log(denom_index))
        end
    end
    return -llike
end



function fgh!(F,G,H,params)
    p = size(params)[1]
    vec=zeros(p)
    mat=zeros(p,p)
    llike = 0.0
    θ = params[1]
    γ = params[2:end]
    γ = vcat(γ,1.0)

    pvals = zeros(Real,Ι,J,N)

    for n = 1:N, j = 1:J
        denom_index = 0.0 :: Real
        for i = 1:Ι
            denom_index += Mat2[j,n]*Mat1[i,n]*exp(-θ*τ[i,j] + γ[i])
        end
        for i = 1:Ι
            pvals[i,j,n] = Mat2[j,n]*Mat1[i,n]*(-θ*τ[i,j] + γ[i] - log(denom_index))
            llike += Mat3[i,j,n]*pvals[i,j,n]
            xvec = vcat(τ[i,j], zeros(Ι))
            xvec[i+1] = 1
            mat .+= (Mat2[j,n]*Mat1[i,n]*exp(pvals[i,j,n])*xvec*xvec')[1:end-1,1:end-1]
        end
    end


    if G !== nothing
        Difference = Mat3 - exp.(pvals)

        for n = 1:N, j = 1:J
            for i = 1:Ι
                vec[1] += Mat2[j,n]*Mat1[i,n]*Difference[i,j,n]*τ[i,j]
            end
            vec[2:end] += (Mat2[j,n]*(Mat1[:,n].*Difference[:,j,n])'*I)[1:end-1]
        end

        G .= vec
    end

    if H !== nothing
        H .= mat
    end

    if F !== nothing
        return -llike
    end
    return nothing
end


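# Test call: fill Gtest and Htest once to check that fgh! runs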
Gtest = zeros(NumberParams)
Htest = zeros(NumberParams,NumberParams)

fgh!(0,Gtest,Htest,ones(NumberParams))


  

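# Version 1: gradient and Hessian generated automatically via ForwardDiff (autodiff = :forward)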
func = TwiceDifferentiable(vars -> Log_Likelihood(N,Ι,J,τ,Mat1,Mat2,Mat3,vars[1:NumberParams]),
                           ones(NumberParams); autodiff=:forward);



opt1 = @time Optim.optimize(func, ones(NumberParams),show_trace = true)

parameters =  Optim.minimizer(opt1)


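# Version 2: analytical gradient and Hessian supplied through fgh! and Optim.only_fgh!, with Newton's method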
opt_explicit = @time Optim.optimize(Optim.only_fgh!(fgh!), ones(NumberParams), Optim.Newton())
parameters_explicit =  Optim.minimizer(opt_explicit)


Thanks so much!

Is this a duplicate of Improving Performance of a Loop?

No, it is different. Setting aside the performance issues we already discussed, for some reason the optimization algorithm is not updating the gradients.
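
For debugging, one check I can run is to compare the hand-coded gradient from fgh! against an automatic-differentiation gradient of the same likelihood, roughly like this (ForwardDiff is not part of the MWE above, it is only a check):

using ForwardDiff

# Fill G_hand with the analytical gradient from fgh! (F and H not requested)
G_hand = zeros(NumberParams)
fgh!(nothing, G_hand, nothing, ones(NumberParams))

# Gradient of the same objective computed by ForwardDiff
G_ad = ForwardDiff.gradient(v -> Log_Likelihood(N, Ι, J, τ, Mat1, Mat2, Mat3, v),
                            ones(NumberParams))

# Should be close to zero if fgh! fills G with the gradient of the returned objective
maximum(abs.(G_hand .- G_ad))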

It might be easier to work with a corrected MWE then?

Hi, I streamlined the MWE a little bit and incorporated the tips you gave me in this new post: Improving performance of a number of functions for large scale gradient based optimization.
