Hi all, I have a maximum likelihood estimator in which I need to integrate using a mixture of two normal distributions with zero mean. Without getting into much details, this entails estimating 3 parameters of the distribution, the standard deviations and the mixture probability.

The weird thing is that whenever I use a gradient-based method on Optim, the parameter corresponding to the mixture probability is always equal to the starting value at the end of the estimation (whether it converged or not), while when I use Nelder-Mead or SA, all parameters ‘move’.

Additionally, both methods lead to the correct parameters of the rest of the model, although when using gradient-based methods the standard deviation estimates are nonsense but are correct on Nelder-Mead.

Maybe this is not specifically about Julia and Optim and there is some problem when estimating mixture probabilities with gradient methods? Or is there something in my code that is causing this problem? I put below a simple (meaningless) example that shows how I’m constructing the draws.

PS: This also happens if I use quadrature methods.

```
using Distributions, Optim, NLSolversBase
function testll(param, MC)
baseprob = MC[:,2]
mix = 1 / (1 + exp(-param[1]))
draws = Array{eltype(param)}(undef, size(MC,1))
for i = 1:size(draws,1)
if baseprob[i] < mix
draws[i] = MC[i,1] * param[2]
else
draws[i] = MC[i,1] * param[3]
end
end
return log(mean(pdf.(Normal(0,10), draws)))
end
function wrap(param)
return testll(param, MC)
end
MC = [rand(Normal(0,1), 1_000) rand(Uniform(0,1), 1_000)]
func = TwiceDifferentiable(wrap, ones(3); autodiff=:forward);
opt = optimize(func, ones(3) , Optim.Options(show_trace=true))
opt1 = optimize(wrap, ones(3), Optim.Options(show_trace=true))
Optim.minimizer(opt) ## Optim.minimizer(opt)[1] is equal to starting value, no matter what it is, try ones(3) .+ 10
Optim.minimizer(opt1) ## all parameters change
```