ModelingToolkit & Distributions for MLE estimation

I am trying to use ModelingToolKit.jl to write a maximum loglikelihood estimator for generic distributions eg Univariate, Mixture, Multivariate etc.

using ModelingToolkit, Optimization, OptimizationOptimJL
using Distributions

f(θ) = MixtureModel([Exponential(θ[1]), Exponential(θ[2])], [θ[3], 1-θ[3]])
ℓ(θ, x) = -loglikelihood(f(θ), x) # = sum(logpdf(f(θ), y) for y in x)

# Data
N = 100
θtrue = [10, 1, 0.2]
x =  rand(f(θtrue), N)

# Observation as parameters (not sure if this is needed?)
@parameters y[1:N]

# variable
@variables θ[1:3]

mle = ℓ(θ, y)

# # Not tested yet, the code above need to work first
# @named sys = OptimizationSystem(mle, θ, y)
# θ0 = [θ[1] => 20, θ[2] => 4, θ[3] => 0.5]
# p = [y[i] => x[i] for i in eachindex(x)]

# prob = OptimizationProblem(sys, θ0, grad = true, hess = true)

However, this does not work:

ERROR: TypeError: non-boolean (Num) used in boolean context
....
[4] Exponential(θ::Num)
@ Distributions C:\Users\dmetivie\.julia\packages\Distributions\Spcmv\src\univariate\continuous\exponential.jl:29

I tried @register_symbolic Exponential(θ) but now the error is

ERROR: MethodError: no method matching MixtureModel(::Vector{Num}, ::Vector{Num})
Closest candidates are:
  MixtureModel(::Type{C}, ::AbstractArray) where C<:Distribution at C:\Users\metivier\.julia\packages\Distributions\Spcmv\src\mixtures\mixturemodel.jl:127
  MixtureModel(::Type{C}, ::AbstractArray, ::Vector{T}) where {C<:Distribution, T<:Real} at C:\Users\metivier\.julia\packages\Distributions\Spcmv\src\mixtures\mixturemodel.jl:144
  MixtureModel(::Vector{C}, ::VT) where {C<:Distribution, VT<:(AbstractVector{<:Real})} at C:\Users\metivier\.julia\packages\Distributions\Spcmv\src\mixtures\mixturemodel.jl:138
Stacktrace:
 [1] f(θ::Symbolics.Arr{Num, 1})
   @ Main c:\Users\metivier\Dropbox\PC (2)\Documents\Simulations\julia_mwe\ModelingToolkit_test_for_TrigMixture\mwe_modelingtoolkit_discourse.jl:4
 [2] ℓ(θ::Symbolics.Arr{Num, 1}, x::Symbolics.Arr{Num, 1})
   @ Main c:\Users\metivier\Dropbox\PC (2)\Documents\Simulations\julia_mwe\ModelingToolkit_test_for_TrigMixture\mwe_modelingtoolkit_discourse.jl:5
 [3] top-level scope
   @ c:\Users\metivier\Dropbox\PC (2)\Documents\Simulations\julia_mwe\ModelingToolkit_test_for_TrigMixture\mwe_modelingtoolkit_discourse.jl:25

It seems that Distributions.jl and ModelingToolKit.jl are not compatible in general, since there are type check, logsumexp with if statements, etc do not like Num.
I saw that DistributionsAD.jl was a good candidate to answer the issue. However, I was not able to understand how it works (could not find documentation).

Anyone with an idea how to implement that?

Why do you need to use ModelingToolkit here, as opposed to just calling Optimization.jl directly?

For reference this works

using Optimization, OptimizationOptimJL
using Distributions

f(θ) = MixtureModel([Exponential(θ[1]), Exponential(θ[2])], [θ[3], 1-θ[3]])
ℓ(θ, x) = -loglikelihood(f(θ), x) # = sum(logpdf(f(θ), y) for y in x)

# Data
N = 100
θtrue = [10, 1, 0.2]
x =  rand(f(θtrue), N)

θ0 = [20, 4, 0.5]
_p = x

OptFunc = OptimizationFunction(ℓ, Optimization.AutoForwardDiff())
l1 = ℓ(θ0, _p)
# prob = OptimizationProblem(OptFunc, θ0, _p) # it goes out of legal distributions bound!
prob = OptimizationProblem(OptFunc, θ0, _p, lb = [0.01, 0.01, 0.01], ub = [100, 100, 1])

# Start with some derivative-free optimizers
sol = solve(prob, SAMIN())

Does it mean that the problem is AutomaticDifferentiation compliant ?

Anyway, I wanted to use ModelingToolKit to get some of what they advertise on the doc "It provides many other features like auto-parallelism and sparsification too. Plus, you can hierarchically nest systems to generate huge optimization problems. ".

Would it make a difference here? I don’t know (guess not).
However, for the true problem I have in mind, the simplications of ModelingToolKit should be very useful.

Its simplifications can help quite a bit on constrained optimization problems and with MOI, but I’m not sure this case would get any benefits (at least with the current integration).

Sure, here it is a MWE to try to understand what is not working.
In my real application, I have periodic terms in the loglikelihood, analytically one can factor all terms with same value to optimize a shorter function.
Something like

θ[1]*y[1] + θ[2]*y[2] + ... + θ[1]*y[100] + θ[2]*y[101] + ... = θ[1]*sum(y[i] ...) + θ[2]*sum(y[i] ...)

That is where I believed ModelingToolKit could shine.

However, that little experiment triggered questions:

  • it seems that all logpdf I use are AD compatible with Optimisation (even logpdf of mixture). I believed it was not the case (that is why DistributionsAD existed). Am I missing something?
  • I read somewhere that if it was AD compatible, it was basically compatible with symbolic packages like ModelingToolKit. Here it does not seem to be true.