Hi all,

I have just released my package ProfileLikelihood.jl that will be fully registered in a few days. The aim of this package is to provide an easy interface for computing maximum likelihood estimates (MLEs), profile likelihoods, and confidence and prediction intervals. By building on top of the great interface from Optimization.jl, it is very easy to switch between different optimisers, use different types of automatic differentiation, to implement constraints.

I define what profile likelihoods, etc. are in the README of the package, but to quickly summarise: Suppose we have a likelihood function \mathcal L(\boldsymbol \theta) (basically a joint probability density function for some sample), and assume that we have some interest parameter \psi so that \boldsymbol \theta is decomposed as \boldsymbol \theta = (\psi, \boldsymbol \omega), where \boldsymbol \omega are the remaining ‘nuisance’ parameters. Currently I only support scalar \psi (edit: **as of v0.2, bivariate profiles are now also supported**). The *profile likelihood* of \psi is defined by \mathcal L_p(\psi_0) = \sup_{\boldsymbol \omega \in \Omega \mid \psi = \psi_0} \mathcal L(\psi, \boldsymbol \omega), i.e. the maximum of the likelihood when we fix \psi at a given value. This profile likelihood makes it easy to compute confidence intervals and approximate prediction intervals. Again, more detail is in the README, and a nice very recent resource is Simpson and Maclaren (2022).

To illustrate the package, I give one of the five examples from the docs below, where I estimate parameters for a logistic ODE, reproducing the first case study of Simpson and Maclaren (2022). The main idea is to first define the likelihood function so that a `LikelihoodProblem`

can then be built. Computing the results is as easy as then using `mle`

, `profile`

, and then `compute_prediction_intervals`

(if you have a prediction function you are interested in). Other examples for multiple linear regression, a simpler ODE \mathrm du/\mathrm dt = \lambda u, and estimating parameters for a PDE (the heat equation) are given in the README.

Suppose we have \mathrm du/\mathrm dt = \lambda u(1-u/K), u(0)=u_0, and our interest is in estimating (\lambda, K, u_0), fixing the standard deviation of the noise, \sigma, at \sigma=10 for this example. First, we need some data for this synthetic study:

```
using OrdinaryDiffEq, Random
λ = 0.01
K = 100.0
u₀ = 10.0
t = 0:100:1000
σ = 10.0
@inline function ode_fnc(u, p, t)
λ, K = p
du = λ * u * (1 - u / K)
return du
end
# Initial data is obtained by solving the ODE
tspan = extrema(t)
p = (λ, K)
prob = ODEProblem(ode_fnc, u₀, tspan, p)
sol = solve(prob, Rosenbrock23(), saveat=t)
Random.seed!(2828)
uᵒ = sol.u + σ * randn(length(t))
```

Next, a likelihood function is defined. I define the log-likelihood function, and use a normal likelihood since I added Gaussian noise to the data:

```
using ProfileLikelihood
function loglik_fnc2(θ, data, integrator)
λ, K, u₀ = θ
uᵒ, σ = data
integrator.p[1] = λ
integrator.p[2] = K
reinit!(integrator, u₀)
solve!(integrator)
return gaussian_loglikelihood(uᵒ, integrator.sol.u, σ, length(uᵒ))
end
```

When solving ODEs, it is assumed that the third argument is an `integrator`

, coming from the integrator interface from DifferentialEquations.jl. In other cases, there are only the first two arguments (see the regression example in the README). You could re-define an `ODEProblem`

many times, but this is slow. To now build the `LikelihoodProblem`

from all this, and constraining 0 \leq \lambda \leq 0.05, 50 \leq K \leq 150, and 0 \leq u_0 \leq 50:

```
using Optimization
lb = [0.0, 50.0, 0.0] # λ, K, u₀
ub = [0.05, 150.0, 50.0]
θ₀ = [λ, K, u₀]
syms = [:λ, :K, :u₀]
prob = LikelihoodProblem(
loglik_fnc2, θ₀, ode_fnc, u₀, maximum(t);
syms=syms,
data=(uᵒ, σ),
ode_parameters=[1.0, 1.0], # temp values for [λ, K]
ode_kwargs=(verbose=false, saveat=t),
f_kwargs=(adtype=Optimization.AutoFiniteDiff(),),
prob_kwargs=(lb=lb, ub=ub),
ode_alg=Rosenbrock23()
)
```

This function will construct the `LikelihoodProblem`

, a function of two arguments that in this case is a closure that contains the integrator for the problem. From this point, it is very easy to find the MLEs and the profile likelihoods:

```
using OptimizationNLopt
sol = mle(prob, NLopt.LD_LBFGS)
```

```
LikelihoodSolution. retcode: Failure
Maximum likelihood: -38.99053694428977
Maximum likelihood estimates: 3-element Vector{Float64}
λ: 0.010438031266786045
K: 99.59921873132551
u₀: 8.098422110755225
```

```
prof = profile(prob, sol; alg=NLopt.LN_NELDERMEAD, parallel=false)
```

```
ProfileLikelihoodSolution. MLE retcode: Failure
Confidence intervals:
95.0% CI for λ: (0.006400992274213644, 0.01786032876226762)
95.0% CI for K: (90.81154862835605, 109.54214763511888)
95.0% CI for u₀: (1.5919805025139593, 19.070831536649305)
```

We now have estimates for our parameters, along with quantified uncertainty from the confidence intervals. Using the `plot_profiles`

function, you can also visualise the profile likelihoods to assess e.g. smoothness and identifiability of the individual parameters:

```
using CairoMakie, LaTeXStrings
fig = plot_profiles(prof; latex_names=[L"\lambda", L"K", L"u_0"],
show_mles=true, shade_ci=true, nrow=1, ncol=3,
true_vals=[λ, K, u₀], # optional, gives the black lines below
fig_kwargs=(fontsize=30, resolution=(2109.644f0, 444.242f0)),
axis_kwargs=(width=600, height=300))
```

A last feature to show, again following Simpson and Maclaren (2022), is the construction of profile-wise profile likelihoods, which give profile likelihoods for some prediction function q(\boldsymbol \theta) conditional on an interest parameter \psi. In this case, suppose we are interested in quantifying the uncertainty in the function u(t) over many times. First, we define a prediction function that will give us our vector of interest:

```
function prediction_function(θ, data)
λ, K, u₀ = θ
t = data
prob = ODEProblem(ode_fnc, u₀, extrema(t), (λ, K))
sol = solve(prob, Rosenbrock23(), saveat=t)
return sol.u
end
```

Computing prediction intervals from this is simple using `get_prediction_intervals`

:

```
t_many_pts = LinRange(extrema(t)..., 1000)
parameter_wise, union_intervals, all_curves, param_range =
get_prediction_intervals(prediction_function, prof,
t_many_pts; parallel = true)
```

The `parameter_wise`

output gives prediction intervals based on varying one parameter at a time, and `union_intervals`

gives the union of the prediction intervals at each time; see the docs for more detail. Plotting these prediction intervals together with the true curve in red + the curve from the MLE in blue, we obtain the figure below (code for making this plot is in the docs):

In (a), (b), and (c), the black curves show 95% profile-wise prediction intervals for the respective parameters. These views allow us to identify how each parameter contributes to the uncertainty in our prediction, e.g. the uncertainty from \lambda is largest for early time, which makes sense since only K matters for large time, and this is what we see in (b) where the interval becomes large only for long time. In (d) I show the union of all these intervals in the black curve, compared to the interval coming from the full likelihood (which is more accurate, but takes a significantly longer time to compute) in magenta. We see that we have fully enclosed the true solution, and this sensitivity analysis of the individual contributions to the total uncertainty is sensible.

All feedback welcome!

Thanks,

Daniel.