Why does Turing.jl not work with the NUTS() and HMC() samplers?

I am experimenting with the Turing.jl package. I am trying to run the example model below, but I get an error message when I use the NUTS() or HMC() samplers. The error only occurs when I slice the array with t[tb:end] inside the model. Note that the model works fine with the MH() sampler.

Would anyone be able to help me understand why the other samplers do not work and how to fix this model? The exact error message I get is shown below.

I am using Turing.jl version v0.29.3.

MWE here:

using Turing

@model function ModelTuring(y, N::Int, t)

    R=zeros(length(t))
    α ~ Distributions.Uniform(0, 1)

    tb=1
    ## This works fine with all samplers
    #R = α .+ t

    ## This works with the MH() sampler only
    R[tb:end] = α .+ t[tb:end]

    for i=1:N
        y[i] ~ Poisson(R[i])
    end
    
end ##

N=10
t = collect(range(0; length=N, step=1))
Rdata= 0.5 .* t

model = ModelTuring(Rdata, N, t)
#sampler = NUTS()
sampler=HMC(0.1,10)
#sampler=MH()

nSamples=5
chain = sample(model, sampler, nSamples; init_params=[0.5])

Here are the first 50 lines of the error message:

ERROR: LoadError: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 1})

Closest candidates are:
  (::Type{T})(::Real, !Matched::RoundingMode) where T<:AbstractFloat
   @ Base rounding.jl:207
  (::Type{T})(::T) where T<:Number
   @ Core boot.jl:792
  (::Type{T})(!Matched::AbstractChar) where T<:Union{AbstractChar, Number}
   @ Base char.jl:50
  ...

Stacktrace:
  [1] convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 1})
    @ Base ./number.jl:7
  [2] setindex!(A::Vector{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 1}, i1::Int64)
    @ Base ./array.jl:969
  [3] setindex!
    @ ./array.jl:983 [inlined]
  [4] ModelTuring(__model__::DynamicPPL.Model{typeof(ModelTuring), (:y, :N, :t), (), (), Tuple{Vector{Float64}, Int64, Vector{Int64}}, Tuple{}, DynamicPPL.DefaultContext}, __varinfo__::DynamicPPL.ThreadSafeVarInfo{DynamicPPL.TypedVarInfo{NamedTuple{(:α,), Tuple{DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:α, Setfield.IdentityLens}, Int64}, Vector{Uniform{Float64}}, Vector{AbstractPPL.VarName{:α, Setfield.IdentityLens}}, Vector{ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 1}}, Vector{Set{DynamicPPL.Selector}}}}}, ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 1}}, Vector{Base.RefValue{ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 1}}}}, __context__::DynamicPPL.SamplingContext{DynamicPPL.Sampler{HMC{Turing.Essential.ForwardDiffAD{0}, (), AdvancedHMC.UnitEuclideanMetric}}, DynamicPPL.DefaultContext, Random.TaskLocalRNG}, y::Vector{Float64}, N::Int64, t::Vector{Int64})
    @ Main /lfs/nobackuplfs03/xom/seismicity/joadsjr/Documents/Work/Projects/Gorgon/ExtendedModelDomain/Abaqus/model_22/Julia/trash_24.jl:18
  [5] _evaluate!!(model::DynamicPPL.Model{typeof(ModelTuring), (:y, :N, :t), (), (), Tuple{Vector{Float64}, Int64, Vector{Int64}}, Tuple{}, DynamicPPL.DefaultContext}, varinfo::DynamicPPL.ThreadSafeVarInfo{DynamicPPL.TypedVarInfo{NamedTuple{(:α,), Tuple{DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:α, Setfield.IdentityLens}, Int64}, Vector{Uniform{Float64}}, Vector{AbstractPPL.VarName{:α, Setfield.IdentityLens}}, Vector{ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 1}}, Vector{Set{DynamicPPL.Selector}}}}}, ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 1}}, Vector{Base.RefValue{ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 1}}}}, context::DynamicPPL.SamplingContext{DynamicPPL.Sampler{HMC{Turing.Essential.ForwardDiffAD{0}, (), AdvancedHMC.UnitEuclideanMetric}}, DynamicPPL.DefaultContext, Random.TaskLocalRNG})
    @ DynamicPPL ~/.julia/packages/DynamicPPL/oX6N7/src/model.jl:963
  [6] evaluate_threadsafe!!(model::DynamicPPL.Model{typeof(ModelTuring), (:y, :N, :t), (), (), Tuple{Vector{Float64}, Int64, Vector{Int64}}, Tuple{}, DynamicPPL.DefaultContext}, varinfo::DynamicPPL.TypedVarInfo{NamedTuple{(:α,), Tuple{DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:α, Setfield.IdentityLens}, Int64}, Vector{Uniform{Float64}}, Vector{AbstractPPL.VarName{:α, Setfield.IdentityLens}}, Vector{ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 1}}, Vector{Set{DynamicPPL.Selector}}}}}, ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 1}}, context::DynamicPPL.SamplingContext{DynamicPPL.Sampler{HMC{Turing.Essential.ForwardDiffAD{0}, (), AdvancedHMC.UnitEuclideanMetric}}, DynamicPPL.DefaultContext, Random.TaskLocalRNG})
    @ DynamicPPL ~/.julia/packages/DynamicPPL/oX6N7/src/model.jl:952
  [7] evaluate!!
    @ ~/.julia/packages/DynamicPPL/oX6N7/src/model.jl:887 [inlined]
  [8] logdensity(f::LogDensityFunction{DynamicPPL.TypedVarInfo{NamedTuple{(:α,), Tuple{DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:α, Setfield.IdentityLens}, Int64}, Vector{Uniform{Float64}}, Vector{AbstractPPL.VarName{:α, Setfield.IdentityLens}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}}}, Float64}, DynamicPPL.Model{typeof(ModelTuring), (:y, :N, :t), (), (), Tuple{Vector{Float64}, Int64, Vector{Int64}}, Tuple{}, DynamicPPL.DefaultContext}, DynamicPPL.SamplingContext{DynamicPPL.Sampler{HMC{Turing.Essential.ForwardDiffAD{0}, (), AdvancedHMC.UnitEuclideanMetric}}, DynamicPPL.DefaultContext, Random.TaskLocalRNG}}, θ::Vector{ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 1}})

This is a common issue I’ve stumbled on when using ForwardDiff.jl. HMC() and NUTS() need gradients of the log density, and by default Turing computes them with ForwardDiff, which evaluates your model with α as a ForwardDiff.Dual number instead of a Float64 (see Conversion and Promotion · The Julia Language for the conversion rules involved). With R = α .+ t the broadcast allocates a fresh array whose element type is promoted to the Dual type, so everything works. But R[tb:end] = α .+ t[tb:end] writes the Dual results into R = zeros(length(t)), which is a Vector{Float64}; setindex! then tries convert(Float64, ::ForwardDiff.Dual), and that is exactly the MethodError at the top of your stacktrace. MH() never differentiates the model, so α stays a Float64 and the assignment is fine.
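Here is a minimal sketch of the same failure outside of Turing, assuming ForwardDiff.jl itself is installed; the hand-built Dual just stands in for what α looks like during a gradient evaluation:

using ForwardDiff

t = collect(0.0:4.0)
d = ForwardDiff.Dual(0.5, 1.0)   # stand-in for α during an HMC/NUTS gradient pass

R_good = d .+ t                  # broadcasting allocates a Vector of Duals -- no conversion needed
R_bad  = zeros(length(t))        # Vector{Float64}, allocated before the Dual appears
R_bad[1:end] = d .+ t            # setindex! tries convert(Float64, ::Dual) and throws the MethodError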

My usual course of action is either to avoid assigning into preallocated arrays, or to make sure those arrays can hold the ForwardDiff.Dual element type. In your code, for example, t[tb:end] and R[tb:end] have element types Int64 and Float64 respectively. If you do the following:

using Turing

@model function ModelTuring(y, N::Int, t)

    α ~ Distributions.Uniform(0, 1)

    #########################################################
    # Ensure arrays have the proper type for ForwardDiff.Dual
    R = zeros(eltype(α), length(t))
    t = convert(Array{eltype(α)}, t)
    #########################################################

    tb=1
    ## This works fine with all samplers
    # R = α .+ t

    ## With the typed arrays above this now works with HMC() and NUTS() as well
    R[tb:end] = α .+ t[tb:end]

    for i=1:N
        y[i] ~ Poisson(R[i])
    end

end ##

you can ensure that the elements of the arrays are correctly typed. There may be a more elegant/performant way, but I haven’t found it yet.
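One slightly shorter variant that seems to work for me is to allocate R directly with the type of α, which avoids converting t; a sketch, not benchmarked against the version above:

using Turing, Distributions

@model function ModelTuring(y, N::Int, t)

    α ~ Distributions.Uniform(0, 1)

    # Fill R with zero(α) so its element type is Float64 under MH()
    # and a ForwardDiff.Dual type under HMC()/NUTS()
    R = fill(zero(α), length(t))

    tb = 1
    R[tb:end] = α .+ t[tb:end]   # the Dual results now convert cleanly into R

    for i in 1:N
        y[i] ~ Poisson(R[i])
    end

end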


Thank you very much for the help @cnrrobertson, your suggestion solved my problem.