# `First function call produced NaNs` when using NUTS but not MH

Hello everyone,

I’m having trouble fitting a Cucker-Smale ODE model with Turing.

The communication kernel Ψ is positive, non-increasing, and smooth; in my model, `Psi(r, K, β) = K / ((1 + r^2)^β)`.

I can solve the ODE system fine on its own, but when I plug it into Turing, NUTS runs into instability issues that MH does not. It looks like it fails at the `gradient_logp` stage, and I’m not sure what would cause this. I’ve extracted the suspect parameters and solved the model at those inputs (with the initial positions and velocities unchanged), and I run into no issues in that case. Is there anything in the equations that might explain this?
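For reference, the standard continuous Cucker-Smale system that the MWE is based on reads (with the kernel above; note that `cuckersmale!` below also adds `du[3, i]` and `du[4, i]` to the position derivatives, which deviates slightly from this formulation):

$$
\dot{x}_i = v_i, \qquad \dot{v}_i = \frac{1}{N} \sum_{j=1}^{N} \Psi\left(\lVert x_j - x_i \rVert\right) \left(v_j - v_i\right), \qquad i = 1, \dots, N.
$$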

Here is a MWE:

```julia
using DifferentialEquations
using Turing
using Distributions
using StaticArrays

Psi(r, K, β) = K / ((1 + r^2)^β) # Communication kernel
twonorm(x) = sqrt(sum(abs2, x))

function cuckersmale!(du, u, p, t)
    N, K, β = p
    for i in 1:N
        xi = SVector(u[1, i], u[2, i])
        vi = SVector(u[3, i], u[4, i])
        totx, toty = zero(eltype(u)), zero(eltype(u))
        for j in 1:N
            xj = SVector(u[1, j], u[2, j])
            vj = SVector(u[3, j], u[4, j])
            x = Psi(twonorm(xj - xi), K, β) .* (vj - vi)
            totx += x[1]
            toty += x[2]
        end
        du[3, i] = totx / N
        du[4, i] = toty / N
        du[1, i] = u[3, i] + du[3, i]
        du[2, i] = u[4, i] + du[4, i]
    end
    return nothing
end

@model function fit_cucker_smaile(data, cucker_smaile_problem, problem_p, global_p)
    β ~ Uniform(0.0, 1.0)
    K ~ InverseGamma(2.0, 3.0)
    var ~ InverseGamma(2.0, 3.0)
    (alg, save_every) = global_p
    (N, _, _) = problem_p
    new_problem_p = (N, K, β)
    prob = remake(cucker_smaile_problem, p=new_problem_p,
                  u0=convert(Matrix{typeof(var)}, cucker_smaile_problem.u0))
    sol = solve(prob, alg, saveat=save_every)
    predicted = vec(sol)
    data .~ Turing.Normal.(predicted, var)
end

function main()
    # SETUP
    # Agent state is (xᵢ(t), vᵢ(t)) ∈ ℝᴺ × ℝᴺ
    N = 30 # Number of agents
    u0 = rand(Uniform(-1.0, 1.0), 4, N)
    β = 0.3
    K = 0.9
    p = (N, K, β)
    tspan = (0.0, 5.0)

    alg = RK4()
    save_every = 0.1
    global_p = (alg, save_every)

    # CONSTRUCT PROBLEM AND SOLVE
    prob = ODEProblem(cuckersmale!, u0, tspan, p)
    sol = solve(prob, alg, saveat=save_every)
    sol_data = vec(sol)
    model = fit_cucker_smaile(sol_data, prob, p, global_p)
    chain = sample(model, NUTS(), 300)
end

main()
```

Here’s the warning:

```
┌ Warning: First function call produced NaNs. Exiting. Double check that none of the initial conditions, parameters, or timespan values are NaN.
└ @ OrdinaryDiffEq C:\Users\jmmsm\.julia\packages\OrdinaryDiffEq\ZgJ9s\src\initdt.jl:121
┌ Warning: dt(2.220446049250313e-16) <= dtmin(2.220446049250313e-16) at t=0.0. Aborting. There is either an error in your model specification or the true solution is unstable.
└ @ SciMLBase C:\Users\jmmsm\.julia\packages\SciMLBase\UEAKN\src\integrator_interface.jl:366
ERROR: DimensionMismatch("arrays could not be broadcast to a common size; got a dimension with lengths 120 and 6120")
```

and the full stacktrace:


How did you do this? Did you `@show prob` and then solve that problem outside of the model without reproducing the error?

I haven’t tried this; I was just using the parameter values themselves. That said, I’m not able to re-solve the printed `prob` outside the model, since I don’t know which package to load to fix this error:

```
ERROR: MethodError: no method matching Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}()
Closest candidates are:
Base.Pairs{K, V, I, A}(::Any, ::Any) where {K, V, I, A} at C:\Users\jmmsm\AppData\Local\Programs\Julia-1.7.2\share\julia\base\essentials.jl:33
Stacktrace:
[1] top-level scope
@ c:\Users\jmmsm\Documents\GitHub\cucker-smaile\mwe.jl:72
```

Edit: here is the `Float64` `prob`:

```
prob = ODEProblem{Matrix{Float64}, Tuple{Float64, Float64}, true, Tuple{Int64, Float64, Float64}, ODEFunction{true, typeof(cuckersmale!), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{true, typeof(cuckersmale!), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(cuckersmale!, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [-0.2897659270649322 0.729591137130186; 0.9114431725115799 0.9394905209496023; 0.202227078380808 0.11659462501297368; 0.9389245625616041 -0.3645827911463726], (0.0, 5.0), (2, 0.3406548813088736, 0.6351259651692638), Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem())
```

And the Dual `prob`:

```
prob = ODEProblem{Matrix{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Essential.var"#f#4"{DynamicPPL.TypedVarInfo{NamedTuple{(:β, :K, :var), Tuple{DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:β, Setfield.IdentityLens}, Int64}, Vector{Uniform{Float64}}, Vector{AbstractPPL.VarName{:β, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}, DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:K, Setfield.IdentityLens}, Int64}, Vector{InverseGamma{Float64}}, Vector{AbstractPPL.VarName{:K, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}, DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:var, Setfield.IdentityLens}, Int64}, Vector{InverseGamma{Float64}}, Vector{AbstractPPL.VarName{:var, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}}}, Float64}, DynamicPPL.Model{typeof(fit_cucker_smaile), (:data, :cucker_smaile_problem, :problem_p, :global_p), (), (), Tuple{Vector{Float64}, ODEProblem{Matrix{Float64}, Tuple{Float64, Float64}, true, Tuple{Int64, Float64, Float64}, ODEFunction{true, typeof(cuckersmale!), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, Tuple{Int64, Float64, Float64}, Tuple{RK4, Float64}}, Tuple{}, DynamicPPL.DefaultContext}, DynamicPPL.Sampler{NUTS{Turing.Essential.ForwardDiffAD{0}, (), AdvancedHMC.DiagEuclideanMetric}},
DynamicPPL.Sampler{NUTS{Turing.Essential.ForwardDiffAD{0}, (), AdvancedHMC.DiagEuclideanMetric}}, DynamicPPL.DefaultContext}, Float64}}(0.9114431725115799,0.0,0.0,0.0) Dual{ForwardDiff.Tag{Turing.Essential.var"#f#4"{DynamicPPL.TypedVarInfo{NamedTuple{(:β, :K, :var), Tuple{DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:β, Setfield.IdentityLens}, Int64}, Vector{Uniform{Float64}}, Vector{AbstractPPL.VarName{:β, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}, DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:K, Setfield.IdentityLens}, Int64}, Vector{InverseGamma{Float64}}, Vector{AbstractPPL.VarName{:K, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}, DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:var, Setfield.IdentityLens}, Int64}, Vector{InverseGamma{Float64}}, Vector{AbstractPPL.VarName{:var, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}}}, Float64}, DynamicPPL.Model{typeof(fit_cucker_smaile), (:data, :cucker_smaile_problem, :problem_p,
Setfield.IdentityLens}, Int64}, Vector{InverseGamma{Float64}}, Vector{AbstractPPL.VarName{:var, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}}}, Float64}, DynamicPPL.Model{typeof(fit_cucker_smaile), (:data, :cucker_smaile_problem, :problem_p, :global_p), (), (), Tuple{Vector{Float64}, ODEProblem{Matrix{Float64},
Tuple{Float64, Float64}, true, Tuple{Int64, Float64, Float64}, ODEFunction{true, typeof(cuckersmale!), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, Tuple{Int64, Float64, Float64}, Tuple{RK4, Float64}}, Tuple{}, DynamicPPL.DefaultContext}, DynamicPPL.Sampler{NUTS{Turing.Essential.ForwardDiffAD{0}, (), AdvancedHMC.DiagEuclideanMetric}}, DynamicPPL.DefaultContext}, Float64}}(0.9389245625616041,0.0,0.0,0.0) Dual{ForwardDiff.Tag{Turing.Essential.var"#f#4"{DynamicPPL.TypedVarInfo{NamedTuple{(:β, :K, :var), Tuple{DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:β, Setfield.IdentityLens}, Int64}, Vector{Uniform{Float64}}, Vector{AbstractPPL.VarName{:β, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}, DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:K, Setfield.IdentityLens}, Int64}, Vector{InverseGamma{Float64}}, Vector{AbstractPPL.VarName{:K, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}, DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:var, Setfield.IdentityLens}, Int64}, Vector{InverseGamma{Float64}}, Vector{AbstractPPL.VarName{:var, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}}}, Float64}, DynamicPPL.Model{typeof(fit_cucker_smaile), (:data, :cucker_smaile_problem, :problem_p, :global_p), (), (), Tuple{Vector{Float64}, ODEProblem{Matrix{Float64}, Tuple{Float64, Float64}, true, Tuple{Int64, Float64, Float64}, ODEFunction{true, typeof(cuckersmale!), LinearAlgebra.UniformScaling{Bool},
Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, Tuple{Int64, Float64, Float64}, Tuple{RK4, Float64}}, Tuple{}, DynamicPPL.DefaultContext}, DynamicPPL.Sampler{NUTS{Turing.Essential.ForwardDiffAD{0}, (), AdvancedHMC.DiagEuclideanMetric}}, DynamicPPL.DefaultContext}, Float64}}(-0.3645827911463726,0.0,0.0,0.0)], (0.0,
5.0), (2, Dual{ForwardDiff.Tag{Turing.Essential.var"#f#4"{DynamicPPL.TypedVarInfo{NamedTuple{(:β, :K, :var), Tuple{DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:β, Setfield.IdentityLens}, Int64}, Vector{Uniform{Float64}}, Vector{AbstractPPL.VarName{:β, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}, DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:K, Setfield.IdentityLens}, Int64}, Vector{InverseGamma{Float64}}, Vector{AbstractPPL.VarName{:K, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}, DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:var, Setfield.IdentityLens}, Int64}, Vector{InverseGamma{Float64}}, Vector{AbstractPPL.VarName{:var, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}}}, Float64}, DynamicPPL.Model{typeof(fit_cucker_smaile), (:data, :cucker_smaile_problem, :problem_p, :global_p), (), (), Tuple{Vector{Float64}, ODEProblem{Matrix{Float64}, Tuple{Float64, Float64}, true, Tuple{Int64, Float64, Float64}, ODEFunction{true, typeof(cuckersmale!), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, Tuple{Int64, Float64, Float64}, Tuple{RK4, Float64}}, Tuple{}, DynamicPPL.DefaultContext}, DynamicPPL.Sampler{NUTS{Turing.Essential.ForwardDiffAD{0}, (), AdvancedHMC.DiagEuclideanMetric}}, DynamicPPL.DefaultContext}, Float64}}(0.3406548813088736,0.0,0.3406548813088736,0.0), Dual{ForwardDiff.Tag{Turing.Essential.var"#f#4"{DynamicPPL.TypedVarInfo{NamedTuple{(:β, :K, :var), Tuple{DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:β, Setfield.IdentityLens}, Int64}, Vector{Uniform{Float64}}, Vector{AbstractPPL.VarName{:β, Setfield.IdentityLens}}, Vector{Float64},
Vector{Set{DynamicPPL.Selector}}}, DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:K, Setfield.IdentityLens}, Int64}, Vector{InverseGamma{Float64}}, Vector{AbstractPPL.VarName{:K, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}, DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:var, Setfield.IdentityLens}, Int64}, Vector{InverseGamma{Float64}}, Vector{AbstractPPL.VarName{:var, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}}}, Float64}, DynamicPPL.Model{typeof(fit_cucker_smaile), (:data, :cucker_smaile_problem, :problem_p, :global_p), (), (), Tuple{Vector{Float64}, ODEProblem{Matrix{Float64}, Tuple{Float64, Float64}, true, Tuple{Int64, Float64, Float64}, ODEFunction{true, typeof(cuckersmale!), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, Tuple{Int64, Float64, Float64}, Tuple{RK4, Float64}}, Tuple{}, DynamicPPL.DefaultContext}, DynamicPPL.Sampler{NUTS{Turing.Essential.ForwardDiffAD{0}, (), AdvancedHMC.DiagEuclideanMetric}}, DynamicPPL.DefaultContext}, Float64}}(0.6351259651692638,0.2317409735370749,0.0,0.0)), Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem())
```

@devmotion and other Turing devs, have you considered having a fixed tag mode?

It’s probably an issue only with the Dual numbers. Enable ForwardDiff’s NaN-safe mode and see what happens.
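For reference, my understanding is that recent ForwardDiff versions let you enable NaN-safe mode via Preferences.jl (older versions required editing the `NANSAFE_MODE_ENABLED` constant in the ForwardDiff source), so check the docs for your version. A sketch:

```julia
# Sketch (double-check against your ForwardDiff version's docs):
# write the "nansafe_mode" preference to LocalPreferences.toml,
# then restart Julia so ForwardDiff is recompiled with NaN-safe
# Dual arithmetic.
using Preferences, ForwardDiff

set_preferences!(ForwardDiff, "nansafe_mode" => true)
```

NaN-safe mode trades a small amount of performance for partials that stay finite in cases like this one.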


No, but it seems like a good idea to improve the stacktraces.
I opened a PR: Improve stacktraces by using custom tag for ForwardDiff by devmotion · Pull Request #1841 · TuringLang/Turing.jl · GitHub

Yes, most likely it is caused by your `twonorm`. Without NaN-safe mode you get

```julia
julia> using ForwardDiff

julia> twonorm(x) = sqrt(sum(abs2, x))
twonorm (generic function with 1 method)

julia> ForwardDiff.gradient(twonorm, zeros(5))
5-element Vector{Float64}:
NaN
NaN
NaN
NaN
NaN
```
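The NaN comes from the chain rule at zero distance: the derivative of `sqrt` blows up at `0`, and multiplying it by the inner derivative `0` gives `NaN`. The arithmetic can be reproduced with plain floats:

```julia
# When xj == xi, twonorm evaluates sqrt(0.0). The chain rule then forms
# (d/dr2 sqrt(r2)) * (d r2) = (1 / (2 * sqrt(0.0))) * 0.0 = Inf * 0.0 = NaN,
# which is exactly what the non-NaN-safe Dual arithmetic produces.
r2 = 0.0                      # sum(abs2, xj - xi) for coincident agents
outer = 1 / (2 * sqrt(r2))    # Inf: sqrt is not differentiable at 0
inner = 0.0                   # partial derivative of r2 at the origin
println(outer * inner)        # NaN
```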

With NaN-safe mode enabled you’ll get

```julia
julia> ForwardDiff.gradient(twonorm, zeros(5))
5-element Vector{Float64}:
0.0
0.0
0.0
0.0
0.0
```

An alternative would be to use `LinearAlgebra.norm`. Surprisingly, however, it returns the following both with and without NaN-safe mode:

```julia
julia> ForwardDiff.gradient(norm, zeros(5))
5-element Vector{Float64}:
0.0
0.0
0.0
0.0
1.0
```

See derivative of `norm` at 0 · Issue #243 · JuliaDiff/ForwardDiff.jl · GitHub and related issues.

As yet another alternative, it would probably be sufficient to add a check like `i == j || (xj = SVector(u[1, j], u[2, j])) == xi` in the inner loop of `cuckersmale!`, since the contribution to `totx` and `toty` from the cases where `xj == xi` (which might occur even if `i != j`) appears to be zero anyway.


On second thought, that would only work if `xi == xj` implied `vi == vj` (which is not guaranteed in general, I assume?). Generally, if `xi == xj`, the contribution to `totx` and `toty` is `K * (vj[1] - vi[1])` and `K * (vj[2] - vi[2])`, which might be nonzero if `vi != vj` (which, of course, is only possible if `i != j`).

So probably the safest and easiest option is to enable NaN-safe mode. Skipping the `i == j` cases would still be reasonable, IMO, but it seems it would not ensure that you don’t run into these gradient issues.
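To illustrate the point with a hypothetical sketch (tuples instead of `SVector`s, not the MWE verbatim): skipping the self-term removes the one guaranteed zero-distance evaluation, but two distinct agents at the same position would still be differentiated through `sqrt` at zero.

```julia
# Sketch of the inner interaction sum with the self-term j == i skipped.
# Coincident agents (r == 0 with j != i) still contribute K * (vj - vi),
# so the gradient NaN can still occur for them.
Psi(r, K, β) = K / ((1 + r^2)^β)
twonorm(x) = sqrt(sum(abs2, x))

function pairwise_force(u, i, N, K, β)
    totx = toty = zero(eltype(u))
    for j in 1:N
        j == i && continue                       # skip the zero self-term
        r = twonorm((u[1, j] - u[1, i], u[2, j] - u[2, i]))
        w = Psi(r, K, β)
        totx += w * (u[3, j] - u[3, i])
        toty += w * (u[4, j] - u[4, i])
    end
    return totx / N, toty / N
end
```

With two coincident agents moving in opposite directions, the pairwise term is `K * (vj - vi)`, which is nonzero, matching the argument above.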


Thanks for the suggestions, both. The `nansafe_mode` option solved the problem. Thanks also for opening that PR: sometimes the stacktraces produced when combining Turing and DiffEq are so long that they don’t even fit in my REPL, and I can’t scroll back to the first part to see where the problem occurred.