I anticipate that this question could be down to my ignorance in statistics.
I’m trying to apply the DifferentialEquations.jl tutorial on how to use Generalized Likelihood to a particular case in which the data is a one dimensional time series.
My model is a SIRD with 4 age classes with three parameters [β, λ_R, λ_D] I’m trying to calibrate with respect to cumulated deaths. Here is the code:
# Basic Utilities
using StaticArrays, RecursiveArrayTools, ParameterizedFunctions
# Numerical Computation
using DifferentialEquations, LinearAlgebra, DiffEqBayes, OrdinaryDiffEq, DiffEqParamEstim, Optim
# Statistics
using Random, Distributions
# Data Wrangling
using Queryverse, DataFrames, HTTP, CSV
final_time = 20 # day up to which calibration will be performed
# Model parameters
β = 0.05 # infection rate
λ_R = 0.55 # inverse of transition time from infected to recovered
λ_D = 0.83 # inverse of transition time from infected to dead
i₀ = 0.075 # fraction of initial infected people in every age class
𝒫 = vcat([β, λ_R, λ_D]...)
# regional contact matrix and regional population
## regional contact matrix
regional_all_contact_matrix = [3.45536 0.485314 0.506389 0.123002 ; 0.597721 2.11738 0.911374 0.323385 ; 0.906231 1.35041 1.60756 0.67411 ; 0.237902 0.432631 0.726488 0.979258] # 4x4 contact matrix
## regional population stratified by age
N= [723208 , 874150, 1330993, 1411928] # array of 4 elements, each of which representing the absolute amount of population in the corresponding age class.
# Initial conditions
I₀ = repeat([i₀],4)
S₀ = N.-I₀
R₀ = [0.0 for n in 1:length(N)]
D₀ = [0.0 for n in 1:length(N)]
D_tot₀ = [0.0 for n in 1:length(N)]
ℬ = vcat([S₀, I₀, R₀, D₀, D_tot₀]...)
# Time
𝒯 = (1.0,final_time);
# data used for calibration
piedmont_cumulative_deaths = [0,2,4,5,5,13,17,21,26,46,59,81,111,133,154,175,209,238,283,315]
function SIRD!(du,u,p,t)
# Parameters to be calibrated
β, λ_R, λ_D = p
# initializa this parameter (death probability stratified by age, taken from literature)
δ₁, δ₂, δ₃, δ₄ = [0.003/100, 0.004/100, (0.015+0.030+0.064+0.213+0.718)/(5*100), (2.384+8.466+12.497+1.117)/(4*100)]
δ = vcat(repeat([δ₁],1),repeat([δ₂],1),repeat([δ₃],1),repeat([δ₄],4-1-1-1))
C = regional_all_contact_matrix
# State variables
S = @view u[4*0+1:4*1]
I = @view u[4*1+1:4*2]
R = @view u[4*2+1:4*3]
D = @view u[4*3+1:4*4]
D_tot = @view u[4*4+1:4*5]
# Differentials
dS = @view du[4*0+1:4*1]
dI = @view du[4*1+1:4*2]
dR = @view du[4*2+1:4*3]
dD = @view du[4*3+1:4*4]
dD_tot = @view du[4*4+1:4*5]
# Force of infection
Λ = β*[sum([C[i,j]*I[j]/N[j] for j in 1:size(C)[1]]) for i in 1:size(C)[2]]
# System of equations
@. dS = -Λ*S
@. dI = Λ*S - ((1-δ)*λ_R + δ*λ_D)*I
@. dR = λ_R*(1-δ)*I
@. dD = λ_D*δ*I
@. dD_tot = dD[1]+dD[2]+dD[3]+dD[4]
end;
# generate fake piedmont_cumulative_recovered
problem = ODEProblem(SIRD!, ℬ, 𝒯, 𝒫)
solution_all = @time solve(problem, saveat=1:final_time);
t = collect(1.:final_time);
I would like to claibrate parameters [β, λ_R, λ_D] with respect to piedmont_cumulative_deaths
, following the above tutorial. I tried the following two approaches:
Approach 1
In this approach, noting that in the tutorial there is a fit_mle
for every time step, I try to replicate it here. The result is an array of distributions
with 0 variance , Nan
fitness , constant impove/step
and somewhat realistic parameter values.
using BlackBoxOptim
aggregate_data = piedmont_cumulative_deaths
distributions = [fit_mle(Normal,[aggregate_data[i]]) for i in 1:length(t)]
obj = build_loss_objective(problem,Tsit5(),LogLikeLoss(t,distributions),
maxiters=10000,verbose=true, save_idxs= [4*4+1])
bound1 = Tuple{Float64, Float64}[(0.01, 1.0),(0.01, 1.0),(0.01, 1.0)]
result = bboptimize(obj;SearchRange = bound1, MaxSteps = 11e3)
Starting optimization with optimizer DiffEvoOpt{FitPopulation{Float64},RadiusLimitedSelector,BlackBoxOptim.AdaptiveDiffEvoRandBin{3},RandomBound{ContinuousRectSearchSpace}}
0.00 secs, 0 evals, 0 steps
0.52 secs, 536 evals, 268 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
1.03 secs, 1086 evals, 543 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
1.53 secs, 1550 evals, 775 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
2.03 secs, 1964 evals, 982 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
2.53 secs, 2500 evals, 1250 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
3.03 secs, 2988 evals, 1494 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
3.53 secs, 3490 evals, 1745 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
4.03 secs, 4012 evals, 2006 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
4.54 secs, 4544 evals, 2272 steps, improv/step: 1.000 (last = 0.9962), fitness=NaN
5.04 secs, 5060 evals, 2530 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
5.54 secs, 5596 evals, 2798 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
6.04 secs, 6064 evals, 3032 steps, improv/step: 0.999 (last = 0.9957), fitness=NaN
6.54 secs, 6548 evals, 3274 steps, improv/step: 0.999 (last = 1.0000), fitness=NaN
7.04 secs, 7066 evals, 3533 steps, improv/step: 0.999 (last = 1.0000), fitness=NaN
7.54 secs, 7638 evals, 3819 steps, improv/step: 0.999 (last = 1.0000), fitness=NaN
8.05 secs, 8166 evals, 4083 steps, improv/step: 0.999 (last = 0.9962), fitness=NaN
8.55 secs, 8718 evals, 4359 steps, improv/step: 0.999 (last = 1.0000), fitness=NaN
9.05 secs, 9184 evals, 4592 steps, improv/step: 0.999 (last = 1.0000), fitness=NaN
9.55 secs, 9628 evals, 4814 steps, improv/step: 0.999 (last = 1.0000), fitness=NaN
10.05 secs, 10110 evals, 5055 steps, improv/step: 0.999 (last = 1.0000), fitness=NaN
10.55 secs, 10634 evals, 5317 steps, improv/step: 0.999 (last = 1.0000), fitness=NaN
11.05 secs, 11116 evals, 5558 steps, improv/step: 0.999 (last = 1.0000), fitness=NaN
11.56 secs, 11644 evals, 5822 steps, improv/step: 0.999 (last = 1.0000), fitness=NaN
12.06 secs, 12146 evals, 6073 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
12.56 secs, 12590 evals, 6295 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
13.06 secs, 13060 evals, 6530 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
13.56 secs, 13586 evals, 6793 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
14.06 secs, 14060 evals, 7030 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
14.57 secs, 14588 evals, 7294 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
15.07 secs, 15110 evals, 7555 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
15.57 secs, 15630 evals, 7815 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
16.07 secs, 16174 evals, 8087 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
16.57 secs, 16670 evals, 8335 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
17.07 secs, 17160 evals, 8580 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
17.57 secs, 17734 evals, 8867 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
18.07 secs, 18256 evals, 9128 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
18.58 secs, 18748 evals, 9374 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
19.08 secs, 19300 evals, 9650 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
19.58 secs, 19878 evals, 9939 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
20.08 secs, 20390 evals, 10195 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
20.59 secs, 20918 evals, 10459 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
21.09 secs, 21394 evals, 10697 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
21.59 secs, 21904 evals, 10952 steps, improv/step: 1.000 (last = 1.0000), fitness=NaN
Optimization stopped after 11001 steps and 21.68 seconds
Termination reason: Max number of steps (11000) reached
Steps per second = 507.50
Function evals per second = 1014.99
Improvements/step = 0.99982
Total function evaluations = 22002
Best candidate found: [0.0609603, 0.0286277, 0.0136111]
Fitness: NaN
Approach 2
To avoid the null variance, I fit only one distribution on all data. This still leads to not realistic parameter values.
using BlackBoxOptim
aggregate_data = piedmont_cumulative_deaths
distributions = [fit_mle(Normal,aggregate_data)]
obj = build_loss_objective(problem,Tsit5(),LogLikeLoss([1],distributions),
maxiters=10000,verbose=true, save_idxs= [4*4+1])
bound1 = Tuple{Float64, Float64}[(0.01, 1.0),(0.01, 1.0),(0.01, 1.0)]
result = bboptimize(obj;SearchRange = bound1, MaxSteps = 11e3)
Starting optimization with optimizer DiffEvoOpt{FitPopulation{Float64},RadiusLimitedSelector,BlackBoxOptim.AdaptiveDiffEvoRandBin{3},RandomBound{ContinuousRectSearchSpace}}
0.00 secs, 0 evals, 0 steps
0.50 secs, 550 evals, 501 steps, improv/step: 0.998 (last = 0.9980), fitness=5.973421797
1.00 secs, 1045 evals, 996 steps, improv/step: 0.999 (last = 1.0000), fitness=5.973421797
1.50 secs, 1558 evals, 1509 steps, improv/step: 0.999 (last = 1.0000), fitness=5.973421797
2.00 secs, 2059 evals, 2010 steps, improv/step: 1.000 (last = 1.0000), fitness=5.973421797
2.50 secs, 2535 evals, 2486 steps, improv/step: 1.000 (last = 1.0000), fitness=5.973421797
3.01 secs, 3067 evals, 3019 steps, improv/step: 0.999 (last = 0.9981), fitness=5.973421797
3.51 secs, 3592 evals, 3544 steps, improv/step: 0.999 (last = 1.0000), fitness=5.973421797
4.01 secs, 4079 evals, 4031 steps, improv/step: 1.000 (last = 1.0000), fitness=5.973421797
4.51 secs, 4602 evals, 4555 steps, improv/step: 0.999 (last = 0.9981), fitness=5.973421797
5.01 secs, 5205 evals, 5158 steps, improv/step: 0.999 (last = 1.0000), fitness=5.973421797
5.51 secs, 5779 evals, 5732 steps, improv/step: 0.999 (last = 1.0000), fitness=5.973421797
6.01 secs, 6295 evals, 6248 steps, improv/step: 1.000 (last = 1.0000), fitness=5.973421797
6.52 secs, 6779 evals, 6732 steps, improv/step: 1.000 (last = 1.0000), fitness=5.973421797
7.02 secs, 7293 evals, 7246 steps, improv/step: 1.000 (last = 1.0000), fitness=5.973421797
7.52 secs, 7814 evals, 7767 steps, improv/step: 1.000 (last = 1.0000), fitness=5.973421797
8.02 secs, 8317 evals, 8270 steps, improv/step: 1.000 (last = 1.0000), fitness=5.973421797
8.52 secs, 8898 evals, 8851 steps, improv/step: 1.000 (last = 1.0000), fitness=5.973421797
9.02 secs, 9408 evals, 9361 steps, improv/step: 1.000 (last = 1.0000), fitness=5.973421797
9.53 secs, 9898 evals, 9851 steps, improv/step: 1.000 (last = 1.0000), fitness=5.973421797
10.03 secs, 10436 evals, 10390 steps, improv/step: 1.000 (last = 0.9981), fitness=5.973421797
10.53 secs, 10961 evals, 10915 steps, improv/step: 1.000 (last = 1.0000), fitness=5.973421797
Optimization stopped after 11001 steps and 10.63 seconds
Termination reason: Max number of steps (11000) reached
Steps per second = 1034.51
Function evals per second = 1038.84
Improvements/step = 0.99973
Total function evaluations = 11047
Best candidate found: [0.717621, 0.9427, 0.0494182]
Fitness: 5.973421797
So does it make sense to use this feature when one has data in a one dimensional array?
If so, how may one use it properly?
Thank you