Our group is interested in fitting unknown parameters in our multi-physics model using the variational inference package AdvancedVI. To fit the parameters, we use experimental data from multiple tests and run a dynamical simulation for each test. Since these simulations can take up to half an hour each, we would like to use the pmap function inside the Turing @model function to speed up the process.
The code below is a toy-model representation of the script we are using; a good chunk of it is borrowed from the Turing.jl webpage:
using Distributed
@everywhere using Turing, ForwardDiff, OrdinaryDiffEq, StatsPlots,
    AdvancedVI, LinearAlgebra, Random
@everywhere Random.seed!(1)
using Plots; plotly()
# toggle autodiff on/off in ODE solver
@everywhere autodiff = true
# toy ODE system
@everywhere function lotka_volterra(du, u, p, t)
    α, β, γ, δ = p
    x, y = u
    du[1] = (α - β*y) * x
    du[2] = (δ*x - γ) * y
    return nothing
end
# simulation output
@everywhere function gen_predict(p, jobid)
@info "Started job $jobid"
u0 = [1.0, 1.0]
tspan = (0.0, 10.0)
prob = ODEProblem(lotka_volterra, u0, tspan, p)
sol = solve(prob, TRBDF2(; autodiff); saveat = 0.1)
return vcat(sol.u...)
end
# note: ADVI requires @everywhere to be placed here
@everywhere @model function fit_lv(data)
    # sampling
    σ² ~ filldist(truncated(Normal(0.05, 0.0125); lower = 0, upper = 1.0), 1)
    α ~ filldist(truncated(Normal(1.5, 0.5); lower = 0.5, upper = 2.5), 1)
    β ~ filldist(truncated(Normal(1.2, 0.5); lower = 0, upper = 2), 1)
    γ ~ filldist(truncated(Normal(3.0, 0.5); lower = 1, upper = 4), 1)
    δ ~ filldist(truncated(Normal(1.0, 0.5); lower = 0, upper = 2), 1)
    p = [α[1], β[1], γ[1], δ[1]]
    # distribute simulation runs over two tests
    joblist = 1:2
    predict = pmap(jobid -> gen_predict(p, jobid), joblist)
    y_sim = vcat(predict...)
    # simulation vs experiment
    data ~ MvNormal(y_sim, σ²[1] * I)
    return nothing
end
# target parameter values
p = [1.5, 1.0, 3.0, 1.0]
# generate mock data points covering two tests
u0 = [1.0, 1.0]
tspan = (0.0, 10.0)
prob = ODEProblem(lotka_volterra, u0, tspan, p)
sol = solve(prob, TRBDF2(; autodiff); saveat = 0.1)
u = vcat(sol.u...)
data = vcat(0.9.*u..., 1.1.*u...)
# configure inference model and fit parameters
model = fit_lv(data)
advi = ADVI(1, 1000)
optimizer = Turing.Variational.TruncatedADAGrad(0.01, 1.0, 10)
@time res = vi(model, advi; optimizer)
println("\ndone")
It is inside the @model function fit_lv that we want to apply pmap to distribute the simulation runs over the two test days while sampling the parameters serially. In the simulation function gen_predict we generally use the solver TRBDF2(autodiff = true) with forward-mode autodiff, since our multi-physics model is stiff.
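As a quick sanity check (not part of the original script; p_check is just an illustrative name), the same distributed call can be exercised outside the @model with plain Float64 parameters. We would expect this to run even under julia -p 2, since the failure only appears once ADVI starts evaluating the model:
# sanity check: call the distributed simulation directly with ordinary floats
p_check = [1.5, 1.0, 3.0, 1.0]
predict = pmap(jobid -> gen_predict(p_check, jobid), 1:2)
y_sim = vcat(predict...)   # same layout as the mock `data` vector built above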
This script works fine if you run it without worker processes (plain julia). However, if you run it with julia -p 2, you get the following error on the solve line once ADVI is activated (Julia version is 1.8.5):
julia> include("sample_turing.jl")
┌ Info: [ADVI] Should only be seen once: optimizer created for θ
└ objectid(θ) = 0x914e03f08498020d
ERROR: LoadError: On worker 3:
First call to automatic differentiation for the Jacobian
failed. This means that the user `f` function is not compatible
with automatic differentiation. Methods to fix this include:
1. Turn off automatic differentiation (e.g. Rosenbrock23() becomes
Rosenbrock23(autodiff=false)). More details can befound at
https://docs.sciml.ai/DiffEqDocs/stable/features/performance_overloads/
2. Improving the compatibility of `f` with ForwardDiff.jl automatic
differentiation (using tools like PreallocationTools.jl). More details
can be found at https://docs.sciml.ai/DiffEqDocs/stable/basics/faq/#Autodifferentiation-and-Dual-Numbers
3. Defining analytical Jacobians. More details can be
found at https://docs.sciml.ai/DiffEqDocs/stable/types/ode_types/#SciMLBase.ODEFunction
Note: turning off automatic differentiation tends to have a very minimal
performance impact (for this use case, because it's forward mode for a
square Jacobian. This is different from optimization gradient scenarios).
However, one should be careful as some methods are more sensitive to
accurate gradients than others. Specifically, Rodas methods like `Rodas4`
and `Rodas5P` require accurate Jacobians in order to have good convergence,
while many other methods like BDF (`QNDF`, `FBDF`), SDIRK (`KenCarp4`),
and Rosenbrock-W (`Rosenbrock23`) do not. Thus if using an algorithm which
is sensitive to autodiff and solving at a low tolerance, please change the
algorithm as well.
"No matching function wrapper was found!"
Stacktrace:
[1] jacobian!
@ ~/.julia/packages/OrdinaryDiffEq/CWSFV/src/derivative_wrappers.jl:230
[2] calc_J!
@ ~/.julia/packages/OrdinaryDiffEq/CWSFV/src/derivative_utils.jl:144 [inlined]
[3] calc_W!
@ ~/.julia/packages/OrdinaryDiffEq/CWSFV/src/derivative_utils.jl:691
[4] update_W!
@ ~/.julia/packages/OrdinaryDiffEq/CWSFV/src/derivative_utils.jl:799 [inlined]
[5] update_W!
@ ~/.julia/packages/OrdinaryDiffEq/CWSFV/src/derivative_utils.jl:798 [inlined]
[6] nlsolve!
@ ~/.julia/packages/OrdinaryDiffEq/CWSFV/src/nlsolve/nlsolve.jl:25
[7] perform_step!
@ ~/.julia/packages/OrdinaryDiffEq/CWSFV/src/perform_step/sdirk_perform_step.jl:481
[8] perform_step!
@ ~/.julia/packages/OrdinaryDiffEq/CWSFV/src/perform_step/sdirk_perform_step.jl:458 [inlined]
[9] solve!
@ ~/.julia/packages/OrdinaryDiffEq/CWSFV/src/solve.jl:520
[10] #__solve#618
@ ~/.julia/packages/OrdinaryDiffEq/CWSFV/src/solve.jl:6
[11] #solve_call#22
@ ~/.julia/packages/DiffEqBase/190F1/src/solve.jl:494 [inlined]
[12] #solve_up#29
@ ~/.julia/packages/DiffEqBase/190F1/src/solve.jl:915
[13] #solve#27
@ ~/.julia/packages/DiffEqBase/190F1/src/solve.jl:825
You get the same error if you instead fit the model with MCMC using the NUTS sampler and MCMCSerial:
res = sample(model, NUTS(0.65), MCMCSerial(), 100, 1)
On Julia 1.7.3, you instead get a dimension error when performing floating-point operations with dual numbers (e.g. in lotka_volterra). We suspect there is a dual-number configuration issue when trying to use pmap on an autodiff-compatible solver within the Turing @model. When we set autodiff = false, the above code runs with the distributed pmap, but we lose solver robustness (risking simulation crashes and potentially skewing the parameter fit).
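For reference, the workaround we can run today is just the flag flip mentioned above, which switches the solver's Jacobian from ForwardDiff to finite differencing (a minimal sketch of the change, not a fix):
# current workaround: runs under julia -p 2, at the cost of solver robustness
@everywhere autodiff = false
# ...so the call inside gen_predict effectively becomes
# sol = solve(prob, TRBDF2(autodiff = false); saveat = 0.1)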
If someone knows how to fix this bug so that we can use both pmap and autodiff = true, we would greatly appreciate it!