# Can't seem to get Optimization.jl to converge to solution in Monte Carlo simulation but the objective function seems to work okay otherwise

A basic binary logic works fine

``````using Optimization, OptimizationOptimJL, OptimizationBBO, OptimizationMOI, ForwardDiff, ModelingToolkit, Random

function SimulateBinaryLogit(x, Beta)
N = size(x, 1)
J = size(Beta, 1)
epsilon = -log.(-log.(rand(N, J)))
Beta_augmented = [Beta zeros(2)]
utility = x * Beta_augmented + epsilon
(maxvalue, maxindex) = findmax(utility, dims=2)
choice = getindex.(maxindex,2)
y = choice .== 1
end

function BinaryLogitLL(u,p)
x = p[:,1:2]
y = p[:,3]
Lambda_xb = exp.(x * u) ./ (1 .+ exp.(x * u))
ll_i = y .* log.(Lambda_xb) + (1 .- y) .* log.(1 .- Lambda_xb)
LL = -sum(ll_i)
end

Random.seed!(1)
N = 1000
Beta = [0.5; 0.5]
income = randn(N, 1)
x = [ones(N, 1) income]
y = SimulateBinaryLogit(x, Beta)
p = [x y]
u0 = [0.0; 0.0]

optf = OptimizationFunction(BinaryLogitLL, Optimization.AutoForwardDiff())
prob = OptimizationProblem(optf, u0, p, lb = [-10.0, -10.0], ub = [10.0, 10.0])
sol = solve(prob, BFGS())
``````

Everything is fine so far. However when I try and expand this to a Monte Carlo simulation

``````struct input
y
x
R
end

function BinaryLogitSimulatedLL(u, p)
y = p.y
x = p.x
R = p.R
Random.seed!(1)
N = size(y, 1)
Simulated_y = Array{Float64, 2}(undef, N, R)
for i in 1:R
Simulated_y[:,i] = SimulateBinaryLogit(x, u)
end
SimProb = mean(Simulated_y, dims=2)
ll_i = y .* log.(SimProb) .+ (1 .- y) .* log.(1 .- SimProb)
LL = -sum(ll_i)
end

R = 1000
p = input(y,x,R)
u = [0.0; 0.0]

optf = OptimizationFunction(BinaryLogitSimulatedLL, Optimization.AutoForwardDiff())
prob = OptimizationProblem(optf, u, p, lb = -1.0*ones(2), ub = ones(2))
sol = solve(prob, BFGS())
``````

It seems that whatever initial values I give the routine tells me that is optimal even though I set the optimal solution in advance to be [0.5;0.5]

Try something that isnâ€™t BFGS. For example, try OptimizationOptimisers.Adam() as the method.

I replaced the final line with

``````sol = solve(prob, ADAM(); maxiters = 1000)
``````

and imported `OptimizationOptimisers`. Still the same result thanks for the suggestion though. If you know where I can look for advice on the package generally not converging/staying at whatever initial values are provided I would be grateful

The problem is not the optimizer. The problem is that your `BinaryLogitSimulatedLL` is non-differentiable and `ForwardDiff` returns a gradient of `[0.0, 0.0]`:

``````julia> ForwardDiff.gradient(x -> BinaryLogitSimulatedLL(x, p), [0.5, 0.5])
2-element Vector{Float64}:
0.0
0.0

julia> ForwardDiff.gradient(x -> BinaryLogitSimulatedLL(x, p), [0.0, 0.0])
2-element Vector{Float64}:
0.0
0.0
``````

I really donâ€™t understand why an error doesnâ€™t get thrown.

I think itâ€™s because your `choice .== 1` return from the `SimulateBinaryLogit` returns a 0 matrix rather than erroring:

``````julia> ForwardDiff.jacobian(x -> SimulateBinaryLogit(p.x, x), [0.0, 0.0])
1000Ă—2 BitMatrix:
0 0
0 0
...
``````

Thank you odow. I feel so stupid I didnâ€™t write the call to simulation function properly and need to change my input struct to include the beta value that influences what the output vector looks like. I also appreciate you for giving me a method to troubleshoot where I went wrong for future reference

1 Like

I feel so stupid

Donâ€™t! This looks highly non-trivialâ€¦

I didnâ€™t write the call to simulation function properly and need to change my input struct to include the beta value that influences what the output vector looks like

To clarify, the problem is that an optimizer like BGFS needs to evaluate the gradient of your objective function. But your objective function involves discrete terms like `findmax` which is non-differentiable so you cannot compute a gradient of it.

You either need to:

• Use a derivative free optimization method
• Re-think how you are computing your objective so that it becomes differentiable
``````using Optimization, OptimizationOptimJL, OptimizationBBO, OptimizationMOI, ForwardDiff, ModelingToolkit, Random, OptimizationOptimisers, Statistics

function SimulateBinaryLogit(x, Beta)
# Random.seed!(1)
N = size(x, 1)
J = size(Beta, 1)
epsilon = -log.(-log.(rand(N, J)))
Beta_augmented = [Beta zeros(2)]
utility = x * Beta_augmented + epsilon
(maxvalue, maxindex) = findmax(utility, dims=2)
choice = getindex.(maxindex,2)
y = choice .== 1
y = convert(Array{Float64}, y)
end

function BinaryLogitLL(u,p)
x = p[:,1:2]
y = p[:,3]
Lambda_xb = exp.(x * u) ./ (1 .+ exp.(x * u))
ll_i = y .* log.(Lambda_xb) + (1 .- y) .* log.(1 .- Lambda_xb)
LL = -sum(ll_i)
end

function BinaryLogitSimulatedLL(u, p)
y = p.y
x = p.x
R = p.R
Beta = u
Random.seed!(1)
# N = size(y, 1)
Simulated_y = Array{Float64, 2}(undef, R, R)
for i in 1:R
Simulated_y[:,i] = SimulateBinaryLogit(x, Beta)
end
SimProb = mean(Simulated_y, dims=2)
ll_i = y .* log.(SimProb) .+ (1 .- y) .* log.(1 .- SimProb)
LL = -sum(ll_i)
end

struct input
y
x
R
end

Random.seed!(1)
N = 1000
Beta = [0.5; 0.5]
income = randn(N, 1)
x = [ones(N, 1) income]
y = SimulateBinaryLogit(x, Beta)
R = 1000
p = input(y,x,R)
u = [0.0; 0.0]

p = [x y]
``````

I converted the discrete function output and still feel no closer to making my original problem work. I am confused by the fact that the earlier function has no problem with gradient calculation but the simulation does.

As @odow pointed out itâ€™s a non-differentiable loss, try `NelderMead()` or `SAMIN()` for the optimizer.

I guess the problem comes from adding more dimensions to the objective function because the objective function I am truly interested in has five unknowns

``````using Distributions, StatsBase, Optimization, OptimizationOptimJL, OptimizationBBO, OptimizationMOI, ForwardDiff, ModelingToolkit, Random, OptimizationOptimisers, Statistics

function SimulateMNProbit(x, Betavec, Omega)
N = size(x, 1)
K = size(x, 2)
J = size(Betavec, 1)/K + 1
J = convert(Int64, J)
Beta = reshape(Betavec, K, J - 1)
xb = x * Beta
diff = [zeros(N, 1) xb + rand(MvNormal(zeros(J - 1), Omega), N)']
(maxvalue, maxindex) = findmax(diff, dims=2)
choice = getindex.(maxindex,2)
end

function MNProbitSimulatedLL(param, y, x, R)
BetaV = param[1:end - 1]
Rho = param[end]
Omega = [1 Rho; Rho 1]
Random.seed!(1)
N = size(y, 1)
J = maximum(y)
Simulated_y = Array{Float64, 2}(undef, N, R)
SimulatedProb = Array{Float64, 2}(undef, N, J)
MyIndex = Array{Float64, 2}(undef, N, J)
for count = 1:R
Simulated_y[:, count] = SimulateMNProbit(x, BetaV, Omega)
end
for j = 1:J
a = mean(Simulated_y .== j, dims=2)
SimulatedProb[:,j] = a
MyIndex[:,j] = y .== j
end
ll_i = sum(MyIndex .* log.(SimulatedProb), dims=2)
LL = -sum(ll_i)
end

function MNProbitSimulatedLLu(u, p)
y = p.y
x = p.x
R = p.R
BetaV = u[1:end - 1]
Rho = u[end]
Omega = [1 Rho; Rho 1]
Random.seed!(1)
N = size(y, 1)
J = maximum(y)
Simulated_y = Array{Float64, 2}(undef, N, R)
SimulatedProb = Array{Float64, 2}(undef, N, J)
MyIndex = Array{Float64, 2}(undef, N, J)
for count = 1:R
Simulated_y[:, count] = SimulateMNProbit(x, BetaV, Omega)
end
for j = 1:J
a = mean(Simulated_y .== j, dims=2)
SimulatedProb[:,j] = a
MyIndex[:,j] = y .== j
end
ll_i = sum(MyIndex .* log.(SimulatedProb), dims=2)
LL = -sum(ll_i)
end

struct input
y
x
R
end

Random.seed!(1)
N = 1000
Beta = 0.5*ones(4)
income = rand(N, 1)
x      = [ones(N, 1) income]
omega = [1 0.5;0.5 1];
y = SimulateMNProbit(x, Beta, omega)
R = 100
p = input(y, x, R)
u =  0.5*ones(5)

LL = MNProbitSimulatedLL(Beta, y, x, R)

lb = [-10;-10;-10;-10;-1.0]
ub = [10;10;10;10;1.0]

optf = OptimizationFunction(MNProbitSimulatedLLu, Optimization.AutoForwardDiff())
prob = OptimizationProblem(optf, u, p, lb = lb, ub = ub)