Hello people,
I’ve been using Turing to infer parameters of an ODE model for close to a year. But since recently, I am facing the following non-reproducible error from the code, which I haven’t encountered previously:
Sampling (2 processes) 0%| | ETA: N/A
Sampling (2 processes) 100%|████████████████████████████| Time: 0:26:00
ERROR: LoadError: TaskFailedException
nested task error: On worker 2:
DimensionMismatch("arrays could not be broadcast to a common size; got a dimension with lengths 0 and 11")
Following is an MWE:
#=
Section 1: Importing packages
=#
using Distributed
using Turing
num_chains = 2;
addprocs(num_chains)
@everywhere using DifferentialEquations, Interpolations, XLSX, DataFrames
@everywhere using Distributions, DistributionsAD, MCMCChains, Turing
@everywhere using LinearAlgebra, Random
@everywhere Random.seed!(18431);
@everywhere using ForwardDiff, Preferences
@everywhere Turing.setadbackend(:forwarddiff)
@everywhere set_preferences!(ForwardDiff, "nansafe_mode" => true)
# Time period for simulation of ODEs
@everywhere tspan = (1.0, 11.0)
# Observation data
@everywhere actual_data = [
10.0 0.0 0.0 0.0;
10.55 0.15 0.01 0.01;
11.36 0.09 0.01 0.01;
12.12 0.05 0.01 0.01;
13.06 0.03 0.01 0.01;
14.79 0.04 0.01 0.01;
19.58 0.08 0.01 0.01;
39.94 0.27 0.03 0.02;
172.38 1.56 0.16 0.04;
1489.52 13.98 1.40 0.20;
20918.01 191.95 19.20 2.18
]
#=
Section 2: ODE model
=#
@everywhere function ODE_model!(dy, y, p, t)
N = 1_000_000; # A constant
@inbounds begin
dy[1] = - p[1] * y[1] * y[3] / N
dy[2] = p[1] * y[1] * y[3] / N - p[2] * y[2]
dy[3] = p[2] * y[2] - p[3] * y[3]
dy[4] = p[3] * (1 - p[7]) * (1 - p[6]) * y[3] - p[5] * y[4]
dy[5] = p[3] * p[7] * (1 - p[6]) * y[3] - p[4] * y[5]
dy[6] = p[3] * p[6] * y[3] + p[5] * y[4] + p[4] * (1 - p[8]) * y[5]
#=
Compartments used for inference
=#
dy[7] = p[4] * p[8] * y[5]
dy[8] = p[2] * y[2]
dy[9] = p[3] * (1 - p[6]) * y[3] - p[5] * y[4] - p[4] * y[5]
dy[10] = p[3] * p[7] * (1 - p[6]) * y[3] - p[4] * y[5]
end
return nothing
end
#=
Section 3: Define the observation model
=#
@everywhere @model function inference_function!(data, ODEtspan)
# Priors of ODE model parameters
β ~ LogNormal(log(2.8), 0.5)
α ~ LogNormal(log(7/6.5), 0.5)
θ ~ LogNormal(log(7/3.2), 0.5)
γᵢ ~ LogNormal(log(7/10), 0.5)
γₙ ~ LogNormal(log(7/8), 0.5)
ϕᵣ ~ Beta(5.0, 1.5)
ϕᵢ ~ Beta(1.5, 5.0)
ϕₖ ~ Beta(1.5, 5.0)
param_prior = [β, α, θ, γᵢ, γₙ, ϕᵣ, ϕᵢ, ϕₖ]
# Priors of observation model parameters
σ₁ ~ Gamma(1.0, 5.0)
σ₂ ~ Gamma(1.0, 5.0)
σ₃ ~ Gamma(1.0, 5.0)
σ₄ ~ Gamma(1.0, 5.0)
# Defining the initial conditions and the problem
u0 = zeros(10) # Total 10 variables in the ODE model
u0[1] = 1_000_000 - 10 # Value of N - 10
u0[3] = 10
u0[8] = 10
inference_problem = ODEProblem(ODE_model!, u0, ODEtspan, param_prior)
# Solve the ODE model
inference_solution = solve(inference_problem, AutoVern7(Rodas5()), saveat = 1.0, save_everystep=false)
# Inference using student's t-distributions
data[2:11,1] ~ arraydist(LocationScale.(inference_solution[7,2:11], σ₁.*sqrt.(data[2:11,1]), TDist.(4)))
data[2:11,2] ~ arraydist(LocationScale.(inference_solution[8,2:11], σ₂.*sqrt.(data[2:11,2]), TDist.(4)))
data[2:11,3] ~ arraydist(LocationScale.(inference_solution[9,2:11], σ₃.*sqrt.(data[2:11,3]), TDist.(4)))
data[2:11,4] ~ arraydist(LocationScale.(inference_solution[10,2:11], σ₄.*sqrt.(data[2:11,4]), TDist.(4)))
end
#=
Section 4: Run the inference procedure and save the chains
=#
@everywhere inference_model = inference_function!(actual_data, tspan)
chains = sample(inference_model, NUTS(2000, 0.65), MCMCDistributed(), 5000, num_chains; progress=true)
Stacktrace shows that the error message is thrown by the inference lines, where I am using the T-distribution with arraydist
.
By ‘non-reproducible’ error, I mean that the error is unpredictable. Sometimes, all the chains finish sampling without the above error, but sometimes, they do not. Sometimes, even the dimensions of the solution object inference_solution
varies, though I am enforcing saveat
while solving, resulting in error "arrays could not be broadcast to a common size; got a dimension with lengths x and 11"
where x
is unpredictable; here I am showing the case for x
of 0.
The code was originally written not more than 6 months ago. I am using Julia v1.7.2
on a CentOS v7 HPC, and the following are the packages:
Status `~/.julia/environments/v1.7/Project.toml`
[a93c6f00] DataFrames v1.3.4
[0c46a032] DifferentialEquations v7.0.0
[31c24e10] Distributions v0.25.60
[ced4e74d] DistributionsAD v0.6.40
[f6369f11] ForwardDiff v0.10.30
[a98d9a8b] Interpolations v0.13.6
[c7f686f2] MCMCChains v5.3.0
[21216c6a] Preferences v1.3.0
[fce5fe82] Turing v0.20.4
[fdbf4ff8] XLSX v0.7.10
[8ba89e20] Distributed
[9a3f8284] Random
Please help me out as I am befuddled with this error which was not at all present since a week ago.