Imputing Discrete Data

ParadaCarleton · April 6, 2021, 3:24am

I’m trying to use NUTS in Turing.jl to estimate some parameters, but have some missing data which follows a binomial distribution. Since I know NUTS can’t impute discrete variables so I’m trying to use Gibbs with PG to sample the missing data, but I assume I’m doing this wrong since this seems to cause the REPL to crash:

    
    # Data
    dataSize = nrow(dataset)
    n = 550 # Observed 550 people to see if they were wearing masks
    villages = dataset[:vilIndex]
    maskCounts::Vector{Int64} = dataset[:maskCount]

    # Set hyperpriors
    μ ~ Normal(-4.5, .5) # Weakly informative prior: ~100% of people in rural Cameroon didn't wear masks in August (from correspondents), claim very certain.
    # 3σ credible interval is "Between 0.1% and 5% of people use masks"
    τ ~ Gamma(4, 1 / 4 / .3^2)
    σ = τ^-.5

    timeμ ~ Normal(1, 1) # Weakly informative prior: Mask use will increase to ~ 3% (170% increase), 3-sigma credible interval up to 18%
    timeτ ~ Gamma(4, 1 / 4 / .2^2) # Weakly informative prior -- standard deviation in random variation is about 20%
    timeσ = timeτ^-.5

    effectμ ~ Normal(0, 1) # Regularizing prior. Assumes effect size will shrink, but is very wide
    effectτ ~ Gamma(4, 1 / 4 / .2^2) # Weakly informative prior -- standard deviation in effect size is about 20%
    effectσ = effectτ^-.5

    estimateBiasμ ~ Normal(.5, 1) # People usually overestimate small numbers
    estimateBiasτ ~ Gamma(4, 1 / 4 / .2^2)
    estimateBiasσ = effectτ^-.5
    estimateτ ~ Gamma(4, 1 / 4 / .2^2)
    estimateσ = estimateτ^-.5

    campaignBiasμ ~ Normal(.5, 1) # Campaign will probably make people overestimate mask wearing more 
    campaignBiasτ ~ Gamma(4, 1 / 4 / .2^2)
    campaignBiasσ = campaignBiasτ^-.5


    constant ~ filldist(NormalCanon(μ * τ, τ), numVillages)
    randomChange ~ filldist(NormalCanon(timeμ * timeτ, timeτ), numVillages)
    campaignEffect ~ filldist(NormalCanon(effectμ * effectτ, effectτ), numVillages)

    estimateBias ~ filldist(NormalCanon(estimateBiasμ * estimateBiasτ, estimateBiasτ), numVillages)
    campaignBias ~ filldist(NormalCanon(campaignBiasμ * campaignBiasτ, campaignBiasτ), numVillages)


    # Add variables on logit scale to make prediction
    prediction = begin
        constant[villages] .+ 
        randomChange[villages] .* dataset[:time] .+ 
        campaignEffect[villages] .* dataset[:followup] .* dataset[:campaign]
    end

    # Likelihood functions
    @. dataset[:maskCount] ~ BinomialLogit(n, prediction)
    @. dataset[:wearsMaskEst] ~ LogitNormal(prediction .+ estimateBias[villages] .+ campaignBias[villages] .* dataset[:followup], estimateσ)
    
    return μ, σ, timeμ, timeσ, effectμ, effectσ, estimateBiasμ, estimateBiasσ, estimateσ, campaignBiasμ, campaignBiasσ

end


##
chains = sample(maskingModel(dataset), Gibbs(PG(100, :maskCount), NUTS()), MCMCThreads(), 50_000, Threads.nthreads())

While on this subject: Is there any reason to prefer one sampler over another in the case of discrete variables? I’m under the impression that HMC/NUTS is the fastest sampler for continuous variables, but otherwise I’m not sure which are best.

cpfiffer · April 6, 2021, 3:47am

What was the error?

ParadaCarleton · April 6, 2021, 4:09am

As mentioned, I don’t get an error, the REPL just crashes; I think right before it did I saw something about a segmentation fault, though.

baggepinnen · April 6, 2021, 5:32am

I assume you are running vscode where the REPL disappears when crashing? If you run the same in a normal terminal, you might be able to reproduce the crash message?

ParadaCarleton · April 6, 2021, 7:42am

I managed to fix the earlier problem, but now get a new error when I try to run this:

ERROR: LoadError: TaskFailedException

    nested task error: TaskFailedException
    Stacktrace:
     [1] wait
       @ ./task.jl:317 [inlined]
     [2] threading_run(func::Function)
       @ Base.Threads ./threadingconstructs.jl:34
     [3] macro expansion
       @ ./threadingconstructs.jl:93 [inlined]
     [4] macro expansion
       @ ~/.julia/packages/AbstractMCMC/oou1a/src/sample.jl:304 [inlined]
     [5] (::AbstractMCMC.var"#30#40"{Bool, Base.Iterators.Pairs{Symbol, UnionAll, Tuple{Symbol}, NamedTuple{(:chain_type,), Tuple{UnionAll}}}, Int64, Int64, Vector{Any}, Vector{UInt64}, Vector{DynamicPPL.Sampler{Gibbs{(:maskCount,), Tuple{PG{(:maskCount,), Turing.Core.ResampleWithESSThreshold{typeof(Turing.Inference.resample_systematic), Float64}}, NUTS{Turing.Core.ReverseDiffAD{true}, (), AdvancedHMC.DiagEuclideanMetric}}}}}, Vector{DynamicPPL.Model{var"#25#26", (:dataset,), (), (), Tuple{DataFrame}, Tuple{}}}, Vector{Random._GLOBAL_RNG}})()
       @ AbstractMCMC ./task.jl:406
    
        nested task error: MethodError: no method matching loglikelihood(::BinomialLogit{Float64, Float64}, ::Missing)
        Closest candidates are:
          loglikelihood(::Turing.Inference.ESSModel, ::Any) at /home/lime/.julia/packages/Turing/uAz5c/src/inference/ess.jl:122
          loglikelihood(::EllipticalSliceSampling.ESSModel, ::Any) at /home/lime/.julia/packages/EllipticalSliceSampling/SxcGW/src/model.jl:42
          loglikelihood(::UnivariateDistribution{S} where S<:ValueSupport, ::Real) at /home/lime/.julia/packages/Distributions/jFoHB/src/univariates.jl:535
          ...
        Stacktrace:
          [1] (::DynamicPPL.var"#96#97")(::Tuple{BinomialLogit{Float64, Float64}, Missing})
            @ DynamicPPL ~/.julia/packages/DynamicPPL/RRHxZ/src/context_implementations.jl:472
          [2] MappingRF
            @ ./reduce.jl:93 [inlined]
          [3] _foldl_impl(op::Base.MappingRF{DynamicPPL.var"#96#97", Base.BottomRF{typeof(Base.add_sum)}}, init::Base._InitialValue, itr::Base.Iterators.Zip{Tuple{Vector{BinomialLogit{Float64, Float64}}, Vector{Union{Missing, Int64}}}})
            @ Base ./reduce.jl:62
          [4] foldl_impl
            @ ./reduce.jl:48 [inlined]
          [5] mapfoldl_impl
            @ ./reduce.jl:44 [inlined]
          [6] #mapfoldl#214
            @ ./reduce.jl:160 [inlined]
          [7] mapfoldl
            @ ./reduce.jl:160 [inlined]
          [8] #mapreduce#218
            @ ./reduce.jl:287 [inlined]
          [9] mapreduce
            @ ./reduce.jl:287 [inlined]
         [10] #sum#221
            @ ./reduce.jl:501 [inlined]
         [11] sum
            @ ./reduce.jl:501 [inlined]
         [12] dot_observe(spl::DynamicPPL.SampleFromPrior, dists::Vector{BinomialLogit{Float64, Float64}}, value::Vector{Union{Missing, Int64}}, vi::DynamicPPL.ThreadSafeVarInfo{DynamicPPL.UntypedVarInfo{DynamicPPL.Metadata{Dict{AbstractPPL.VarName, Int64}, Vector{Distribution}, Vector{AbstractPPL.VarName}, Vector{Real}, Vector{Set{DynamicPPL.Selector}}}, Float64}, Vector{Base.RefValue{Float64}}})
            @ DynamicPPL ~/.julia/packages/DynamicPPL/RRHxZ/src/context_implementations.jl:471
         [13] _dot_tilde
            @ ~/.julia/packages/DynamicPPL/RRHxZ/src/context_implementations.jl:428 [inlined]
         [14] dot_tilde
            @ ~/.julia/packages/DynamicPPL/RRHxZ/src/context_implementations.jl:386 [inlined]
         [15] dot_tilde_observe(ctx::DynamicPPL.DefaultContext, sampler::DynamicPPL.SampleFromPrior, right::Vector{BinomialLogit{Float64, Float64}}, left::Vector{Union{Missing, Int64}}, vn::AbstractPPL.VarName{:dataset, Tuple{Tuple{Symbol}}}, inds::Tuple{Tuple{Symbol}}, vi::DynamicPPL.ThreadSafeVarInfo{DynamicPPL.UntypedVarInfo{DynamicPPL.Metadata{Dict{AbstractPPL.VarName, Int64}, Vector{Distribution}, Vector{AbstractPPL.VarName}, Vector{Real}, Vector{Set{DynamicPPL.Selector}}}, Float64}, Vector{Base.RefValue{Float64}}})
            @ DynamicPPL ~/.julia/packages/DynamicPPL/RRHxZ/src/context_implementations.jl:408
         [16] #25
            @ ~/Documents/healthModel2020/yeet.jl:81 [inlined]
         [17] (::var"#25#26")(_rng::Random._GLOBAL_RNG, _model::DynamicPPL.Model{var"#25#26", (:dataset,), (), (), Tuple{DataFrame}, Tuple{}}, _varinfo::DynamicPPL.ThreadSafeVarInfo{DynamicPPL.UntypedVarInfo{DynamicPPL.Metadata{Dict{AbstractPPL.VarName, Int64}, Vector{Distribution}, Vector{AbstractPPL.VarName}, Vector{Real}, Vector{Set{DynamicPPL.Selector}}}, Float64}, Vector{Base.RefValue{Float64}}}, _sampler::DynamicPPL.SampleFromPrior, _context::DynamicPPL.DefaultContext, dataset::DataFrame)
            @ Main ./none:0
         [18] macro expansion
            @ ~/.julia/packages/DynamicPPL/RRHxZ/src/model.jl:0 [inlined]
         [19] _evaluate
            @ ~/.julia/packages/DynamicPPL/RRHxZ/src/model.jl:154 [inlined]
         [20] evaluate_threadsafe
            @ ~/.julia/packages/DynamicPPL/RRHxZ/src/model.jl:144 [inlined]
         [21] Model
            @ ~/.julia/packages/DynamicPPL/RRHxZ/src/model.jl:94 [inlined]
         [22] DynamicPPL.VarInfo(rng::Random._GLOBAL_RNG, model::DynamicPPL.Model{var"#25#26", (:dataset,), (), (), Tuple{DataFrame}, Tuple{}}, sampler::DynamicPPL.SampleFromPrior, context::DynamicPPL.DefaultContext)
            @ DynamicPPL ~/.julia/packages/DynamicPPL/RRHxZ/src/varinfo.jl:126
         [23] VarInfo
            @ ~/.julia/packages/DynamicPPL/RRHxZ/src/varinfo.jl:125 [inlined]
         [24] step(rng::Random._GLOBAL_RNG, model::DynamicPPL.Model{var"#25#26", (:dataset,), (), (), Tuple{DataFrame}, Tuple{}}, spl::DynamicPPL.Sampler{Gibbs{(:maskCount,), Tuple{PG{(:maskCount,), Turing.Core.ResampleWithESSThreshold{typeof(Turing.Inference.resample_systematic), Float64}}, NUTS{Turing.Core.ReverseDiffAD{true}, (), AdvancedHMC.DiagEuclideanMetric}}}}; resume_from::Nothing, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
            @ DynamicPPL ~/.julia/packages/DynamicPPL/RRHxZ/src/sampler.jl:73
         [25] step
            @ ~/.julia/packages/DynamicPPL/RRHxZ/src/sampler.jl:66 [inlined]
         [26] macro expansion
            @ ~/.julia/packages/AbstractMCMC/oou1a/src/sample.jl:97 [inlined]
         [27] macro expansion
            @ ~/.julia/packages/AbstractMCMC/oou1a/src/logging.jl:15 [inlined]
         [28] mcmcsample(rng::Random._GLOBAL_RNG, model::DynamicPPL.Model{var"#25#26", (:dataset,), (), (), Tuple{DataFrame}, Tuple{}}, sampler::DynamicPPL.Sampler{Gibbs{(:maskCount,), Tuple{PG{(:maskCount,), Turing.Core.ResampleWithESSThreshold{typeof(Turing.Inference.resample_systematic), Float64}}, NUTS{Turing.Core.ReverseDiffAD{true}, (), AdvancedHMC.DiagEuclideanMetric}}}}, N::Int64; progress::Bool, progressname::String, callback::Nothing, discard_initial::Int64, thinning::Int64, chain_type::Type, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
            @ AbstractMCMC ~/.julia/packages/AbstractMCMC/oou1a/src/sample.jl:88
         [29] #sample#3
            @ ~/.julia/packages/Turing/uAz5c/src/inference/Inference.jl:156 [inlined]
         [30] macro expansion
            @ ~/.julia/packages/AbstractMCMC/oou1a/src/sample.jl:313 [inlined]
         [31] (::AbstractMCMC.var"#887#threadsfor_fun#41"{UnitRange{Int64}, Bool, Base.Iterators.Pairs{Symbol, UnionAll, Tuple{Symbol}, NamedTuple{(:chain_type,), Tuple{UnionAll}}}, Int64, Vector{Any}, Vector{UInt64}, Vector{DynamicPPL.Sampler{Gibbs{(:maskCount,), Tuple{PG{(:maskCount,), Turing.Core.ResampleWithESSThreshold{typeof(Turing.Inference.resample_systematic), Float64}}, NUTS{Turing.Core.ReverseDiffAD{true}, (), AdvancedHMC.DiagEuclideanMetric}}}}}, Vector{DynamicPPL.Model{var"#25#26", (:dataset,), (), (), Tuple{DataFrame}, Tuple{}}}, Vector{Random._GLOBAL_RNG}})(onethread::Bool)
            @ AbstractMCMC ./threadingconstructs.jl:81
         [32] (::AbstractMCMC.var"#887#threadsfor_fun#41"{UnitRange{Int64}, Bool, Base.Iterators.Pairs{Symbol, UnionAll, Tuple{Symbol}, NamedTuple{(:chain_type,), Tuple{UnionAll}}}, Int64, Vector{Any}, Vector{UInt64}, Vector{DynamicPPL.Sampler{Gibbs{(:maskCount,), Tuple{PG{(:maskCount,), Turing.Core.ResampleWithESSThreshold{typeof(Turing.Inference.resample_systematic), Float64}}, NUTS{Turing.Core.ReverseDiffAD{true}, (), AdvancedHMC.DiagEuclideanMetric}}}}}, Vector{DynamicPPL.Model{var"#25#26", (:dataset,), (), (), Tuple{DataFrame}, Tuple{}}}, Vector{Random._GLOBAL_RNG}})()
            @ AbstractMCMC ./threadingconstructs.jl:48
Stacktrace:
  [1] sync_end(c::Channel{Any})
    @ Base ./task.jl:364
  [2] macro expansion
    @ ./task.jl:383 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/AbstractMCMC/oou1a/src/sample.jl:282 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/ProgressLogging/6KXlp/src/ProgressLogging.jl:328 [inlined]
  [5] macro expansion
    @ ~/.julia/packages/AbstractMCMC/oou1a/src/logging.jl:8 [inlined]
  [6] mcmcsample(rng::Random._GLOBAL_RNG, model::DynamicPPL.Model{var"#25#26", (:dataset,), (), (), Tuple{DataFrame}, Tuple{}}, sampler::DynamicPPL.Sampler{Gibbs{(:maskCount,), Tuple{PG{(:maskCount,), Turing.Core.ResampleWithESSThreshold{typeof(Turing.Inference.resample_systematic), Float64}}, NUTS{Turing.Core.ReverseDiffAD{true}, (), AdvancedHMC.DiagEuclideanMetric}}}}, ::MCMCThreads, N::Int64, nchains::Int64; progress::Bool, progressname::String, kwargs::Base.Iterators.Pairs{Symbol, UnionAll, Tuple{Symbol}, NamedTuple{(:chain_type,), Tuple{UnionAll}}})
    @ AbstractMCMC ~/.julia/packages/AbstractMCMC/oou1a/src/sample.jl:276
  [7] sample(rng::Random._GLOBAL_RNG, model::DynamicPPL.Model{var"#25#26", (:dataset,), (), (), Tuple{DataFrame}, Tuple{}}, sampler::DynamicPPL.Sampler{Gibbs{(:maskCount,), Tuple{PG{(:maskCount,), Turing.Core.ResampleWithESSThreshold{typeof(Turing.Inference.resample_systematic), Float64}}, NUTS{Turing.Core.ReverseDiffAD{true}, (), AdvancedHMC.DiagEuclideanMetric}}}}, parallel::MCMCThreads, N::Int64, n_chains::Int64; chain_type::Type, progress::Bool, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Turing.Inference ~/.julia/packages/Turing/uAz5c/src/inference/Inference.jl:217
  [8] sample
    @ ~/.julia/packages/Turing/uAz5c/src/inference/Inference.jl:217 [inlined]
  [9] #sample#6
    @ ~/.julia/packages/Turing/uAz5c/src/inference/Inference.jl:202 [inlined]
 [10] sample
    @ ~/.julia/packages/Turing/uAz5c/src/inference/Inference.jl:202 [inlined]
 [11] #sample#5
    @ ~/.julia/packages/Turing/uAz5c/src/inference/Inference.jl:189 [inlined]
 [12] sample(model::DynamicPPL.Model{var"#25#26", (:dataset,), (), (), Tuple{DataFrame}, Tuple{}}, alg::Gibbs{(:maskCount,), Tuple{PG{(:maskCount,), Turing.Core.ResampleWithESSThreshold{typeof(Turing.Inference.resample_systematic), Float64}}, NUTS{Turing.Core.ReverseDiffAD{true}, (), AdvancedHMC.DiagEuclideanMetric}}}, parallel::MCMCThreads, N::Int64, n_chains::Int64)
    @ Turing.Inference ~/.julia/packages/Turing/uAz5c/src/inference/Inference.jl:189
 [13] top-level scope
    @ ~/Documents/healthModel2020/yeet.jl:90
in expression starting at /home/lime/Documents/healthModel2020/yeet.jl:90

So I’m guessing I’m making a mistake with regards to how Turing handles missing data imputation.

cpfiffer · April 6, 2021, 3:23pm

Yeah, I think we’re just not catching the BinomialLogit with missing values correctly. Can you open an issue here?

ParadaCarleton · April 6, 2021, 4:47pm

I don’t think that’s the problem – the binomial from Distributions.jl has the same problem.

cpfiffer · April 6, 2021, 5:06pm

No, it’s BinomialLogit, which Turing defines: Turing.jl/distributions.jl at master · TuringLang/Turing.jl · GitHub

cpfiffer · April 6, 2021, 5:07pm

Oh, I understand your comment.

cpfiffer · April 6, 2021, 5:08pm

Hm, I think this is maybe because the broadcasting operation doesn’t create a new variable. Can you try replacing the likelihood stuff with a for loop?

ParadaCarleton · April 6, 2021, 6:52pm

Trying to replace the likelihood things with a for loop results in the segmentation fault coming back, and I also expect the code to be extremely slow without vectorization. Is there a way to create a new variable while keeping the vectorized form?

cpfiffer · April 6, 2021, 7:33pm

The segfault might simply be due to the fact that the DataFrames column holding your data is not a task-local data structure, so when it might normally attempt to create a new variable, the particle Gibbs sampler cannot correctly perform inference. Essentially, the fix here is to create a TArray holding the data you want to impute. The non-missing values will be used for the likelihood and the missing values will be targeted for imputation. More info here.

As to the creating a new variable in the broadcasting format, no, the model compiler currently doesn’t know what to do with that.

ParadaCarleton · April 6, 2021, 8:35pm

I tried switching to TArrays, but this doesn’t seem to fix the problem:

@model maskingModel(dataset) = begin
    
    # Data
    dataSize = nrow(dataset)
    n = 550 # Observed 550 people to see if they were wearing masks
    villages = dataset[:vilIndex]
    maskCounts = tzeros(Union{UInt16, Missing}, dataSize)
    maskEsts = tzeros(Float64, dataSize)
    for i in 1:dataSize
        maskCounts[i] = dataset[:maskCount][i]
        maskEsts[i] = dataset[:wearsMaskEst][i]
    end

    
    # Set hyperpriors
    μ ~ Normal(-4.5, .5) # Weakly informative prior: ~100% of people in rural Cameroon didn't wear masks in August (from correspondents), claim very certain.
    # 3σ credible interval is "Between 0.1% and 5% of people use masks"
    τ ~ Gamma(4, 1 / 4 / .3^2)
    σ = τ^-.5

    timeμ ~ Normal(1, 1) # Weakly informative prior: Mask use will increase to ~ 3% (170% increase), 3-sigma credible interval up to 18%
    timeτ ~ Gamma(4, 1 / 4 / .2^2) # Weakly informative prior -- standard deviation in random variation is about 20%
    timeσ = timeτ^-.5

    effectμ ~ Normal(0, 1) # Regularizing prior. Assumes effect size will shrink, but is very wide
    effectτ ~ Gamma(4, 1 / 4 / .2^2) # Weakly informative prior -- standard deviation in effect size is about 20%
    effectσ = effectτ^-.5

    estimateBiasμ ~ Normal(.5, 1) # People usually overestimate small numbers
    estimateBiasτ ~ Gamma(4, 1 / 4 / .2^2)
    estimateBiasσ = effectτ^-.5
    estimateτ ~ Gamma(4, 1 / 4 / .2^2)
    estimateσ = estimateτ^-.5

    campaignBiasμ ~ Normal(.5, 1) # Campaign will probably make people overestimate mask wearing more 
    campaignBiasτ ~ Gamma(4, 1 / 4 / .2^2)
    campaignBiasσ = campaignBiasτ^-.5


    constant ~ filldist(NormalCanon(μ * τ, τ), numVillages)
    randomChange ~ filldist(NormalCanon(timeμ * timeτ, timeτ), numVillages)
    campaignEffect ~ filldist(NormalCanon(effectμ * effectτ, effectτ), numVillages)

    estimateBias ~ filldist(NormalCanon(estimateBiasμ * estimateBiasτ, estimateBiasτ), numVillages)
    campaignBias ~ filldist(NormalCanon(campaignBiasμ * campaignBiasτ, campaignBiasτ), numVillages)


    # Add variables on logit scale to make prediction
    prediction = @. begin
        constant[villages] + 
        randomChange[villages] * dataset[:time] + 
        campaignEffect[villages] * dataset[:followup] * dataset[:campaign]
    end

    # Likelihood functions
    @. maskCounts ~ BinomialLogit(n, prediction)
    @. maskEsts ~ LogitNormal(prediction + estimateBias[villages] + campaignBias[villages] * dataset[:followup] * dataset[:campaign], estimateσ)
    
    return μ, σ, timeμ, timeσ, effectμ, effectσ, estimateBiasμ, estimateBiasσ, estimateσ, campaignBiasμ, campaignBiasσ

end

I get the following error:

julia> chains = sample(maskingModel(dataset), Gibbs(PG(10, :maskCounts), NUTS()), MCMCThreads(), 1000, Threads.nthreads())
       

signal (11): Segmentation fault
in expression starting at /home/lime/Documents/healthModel2020/yeet.jl:97
gc_read_stack at /buildworker/worker/package_linux64/build/src/gc.c:1639 [inlined]
gc_mark_loop at /buildworker/worker/package_linux64/build/src/gc.c:2675
_jl_gc_collect at /buildworker/worker/package_linux64/build/src/gc.c:3034
jl_gc_collect at /buildworker/worker/package_linux64/build/src/gc.c:3241
maybe_collect at /buildworker/worker/package_linux64/build/src/gc.c:880 [inlined]
jl_gc_pool_alloc at /buildworker/worker/package_linux64/build/src/gc.c:1204
jl_gc_alloc_ at /buildworker/worker/package_linux64/build/src/julia_internal.h:285 [inlined]
jl_new_typevar at /buildworker/worker/package_linux64/build/src/builtins.c:1118
rename_unionall at /buildworker/worker/package_linux64/build/src/subtype.c:466 [inlined]
unalias_unionall at /buildworker/worker/package_linux64/build/src/subtype.c:808
subtype_unionall at /buildworker/worker/package_linux64/build/src/subtype.c:836 [inlined]
subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1285
with_tvar at /buildworker/worker/package_linux64/build/src/subtype.c:714
subtype_unionall at /buildworker/worker/package_linux64/build/src/subtype.c:839 [inlined]
subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1282
with_tvar at /buildworker/worker/package_linux64/build/src/subtype.c:714
subtype_unionall at /buildworker/worker/package_linux64/build/src/subtype.c:839 [inlined]
subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1282
with_tvar at /buildworker/worker/package_linux64/build/src/subtype.c:745
subtype_unionall at /buildworker/worker/package_linux64/build/src/subtype.c:839 [inlined]
subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1282
with_tvar at /buildworker/worker/package_linux64/build/src/subtype.c:745
subtype_unionall at /buildworker/worker/package_linux64/build/src/subtype.c:839 [inlined]
subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1282
with_tvar at /buildworker/worker/package_linux64/build/src/subtype.c:745
subtype_unionall at /buildworker/worker/package_linux64/build/src/subtype.c:839 [inlined]
subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1282
exists_subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1426 [inlined]
forall_exists_subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1454
jl_subtype_env at /buildworker/worker/package_linux64/build/src/subtype.c:1888
intersect at /buildworker/worker/package_linux64/build/src/subtype.c:3022
intersect_tuple at /buildworker/worker/package_linux64/build/src/subtype.c:2717 [inlined]
intersect at /buildworker/worker/package_linux64/build/src/subtype.c:3085
intersect_all at /buildworker/worker/package_linux64/build/src/subtype.c:3166
intersect_types at /buildworker/worker/package_linux64/build/src/subtype.c:3237
jl_has_empty_intersection at /buildworker/worker/package_linux64/build/src/subtype.c:3248
ml_matches at /buildworker/worker/package_linux64/build/src/gf.c:2866
jl_matching_methods at /buildworker/worker/package_linux64/build/src/gf.c:1885
_methods_by_ftype at ./reflection.jl:867 [inlined]
#findall#216 at ./compiler/methodtable.jl:57 [inlined]
findall##kw at ./compiler/methodtable.jl:54
jfptr_findallYY.YY.kw_13165.clone_1 at /usr/lib/julia/sys.so (unknown line)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
#218 at ./compiler/methodtable.jl:67
get! at ./iddict.jl:163 [inlined]
#findall#217 at ./compiler/methodtable.jl:66 [inlined]
findall##kw at ./compiler/methodtable.jl:65
jfptr_findallYY.YY.kw_13160.clone_1 at /usr/lib/julia/sys.so (unknown line)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:84
abstract_call_known at ./compiler/abstractinterpretation.jl:1033
abstract_call at ./compiler/abstractinterpretation.jl:1056
abstract_call at ./compiler/abstractinterpretation.jl:1040
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1167
typeinf_local at ./compiler/abstractinterpretation.jl:1447
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1520
_typeinf at ./compiler/typeinfer.jl:214
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:806
abstract_call_method at ./compiler/abstractinterpretation.jl:490
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:143
abstract_call_known at ./compiler/abstractinterpretation.jl:1033
abstract_call at ./compiler/abstractinterpretation.jl:1056
abstract_call at ./compiler/abstractinterpretation.jl:1040
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1167
typeinf_local at ./compiler/abstractinterpretation.jl:1462
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1520
_typeinf at ./compiler/typeinfer.jl:214
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:806
abstract_call_method at ./compiler/abstractinterpretation.jl:490
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:143
abstract_call_known at ./compiler/abstractinterpretation.jl:1033
abstract_call at ./compiler/abstractinterpretation.jl:1056
abstract_call at ./compiler/abstractinterpretation.jl:1040
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1167
typeinf_local at ./compiler/abstractinterpretation.jl:1462
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1520
_typeinf at ./compiler/typeinfer.jl:214
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:806
abstract_call_method at ./compiler/abstractinterpretation.jl:490
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:143
abstract_call_known at ./compiler/abstractinterpretation.jl:1033
abstract_call at ./compiler/abstractinterpretation.jl:1056
abstract_call at ./compiler/abstractinterpretation.jl:1040
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1167
typeinf_local at ./compiler/abstractinterpretation.jl:1462
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1520
_typeinf at ./compiler/typeinfer.jl:214
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:806
abstract_call_method at ./compiler/abstractinterpretation.jl:490
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:143
abstract_call at ./compiler/abstractinterpretation.jl:1054
abstract_call at ./compiler/abstractinterpretation.jl:1040
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1167
typeinf_local at ./compiler/abstractinterpretation.jl:1462
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1520
_typeinf at ./compiler/typeinfer.jl:214
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:806
abstract_call_method at ./compiler/abstractinterpretation.jl:490
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:143
abstract_call at ./compiler/abstractinterpretation.jl:1054
abstract_apply at ./compiler/abstractinterpretation.jl:738
abstract_call_known at ./compiler/abstractinterpretation.jl:952
abstract_call at ./compiler/abstractinterpretation.jl:1056
abstract_call at ./compiler/abstractinterpretation.jl:1040
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1167
typeinf_local at ./compiler/abstractinterpretation.jl:1462
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1520
_typeinf at ./compiler/typeinfer.jl:214
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:806
abstract_call_method at ./compiler/abstractinterpretation.jl:490
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:143
abstract_call at ./compiler/abstractinterpretation.jl:1054
abstract_call at ./compiler/abstractinterpretation.jl:1040
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1167
typeinf_local at ./compiler/abstractinterpretation.jl:1462
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1520
_typeinf at ./compiler/typeinfer.jl:214
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:806
abstract_call_method at ./compiler/abstractinterpretation.jl:490
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:143
abstract_call at ./compiler/abstractinterpretation.jl:1054
abstract_call at ./compiler/abstractinterpretation.jl:1040
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1167
typeinf_local at ./compiler/abstractinterpretation.jl:1447
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1520
_typeinf at ./compiler/typeinfer.jl:214
typeinf at ./compiler/typeinfer.jl:209
typeinf_ext at ./compiler/typeinfer.jl:892
typeinf_ext_toplevel at ./compiler/typeinfer.jl:925
typeinf_ext_toplevel at ./compiler/typeinfer.jl:921
jfptr_typeinf_ext_toplevel_11290.clone_1 at /usr/lib/julia/sys.so (unknown line)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1703 [inlined]
jl_type_infer at /buildworker/worker/package_linux64/build/src/gf.c:298
jl_generate_fptr at /buildworker/worker/package_linux64/build/src/jitlayers.cpp:340
jl_compile_method_internal at /buildworker/worker/package_linux64/build/src/gf.c:1970
jl_compile_method_internal at /buildworker/worker/package_linux64/build/src/gf.c:2236 [inlined]
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2229 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1703 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:839
unknown function (ip: (nil))
Allocations: 353890767 (Pool: 353839962; Big: 50805); GC: 237

ParadaCarleton · April 6, 2021, 11:53pm

Neither does replacing vectorization with a for loop, btw:

##

@model maskingModel(dataset) = begin
    
    # Data
    dataSize = nrow(dataset)
    n = 550 # Observed 550 people to see if they were wearing masks
    
    maskCounts = tzeros(Union{UInt16, Missing}, dataSize)
    maskEsts = tzeros(Float64, dataSize)
    villages = dataset[:vilIndex]
    for i in 1:dataSize
        maskCounts[i] = dataset[:maskCount][i]
        maskEsts[i] = dataset[:wearsMaskEst][i]
    end

    
    # Set hyperpriors
    μ ~ Normal(-4.5, .5) # Weakly informative prior: ~100% of people in rural Cameroon didn't wear masks in August (from correspondents), claim very certain.
    # 3σ credible interval is "Between 0.1% and 5% of people use masks"
    τ ~ Gamma(4, 1 / 4 / .3^2)
    σ = τ^-.5

    timeμ ~ Normal(1, 1) # Weakly informative prior: Mask use will increase to ~ 3% (170% increase), 3-sigma credible interval up to 18%
    timeτ ~ Gamma(4, 1 / 4 / .2^2) # Weakly informative prior -- standard deviation in random variation is about 20%
    timeσ = timeτ^-.5

    effectμ ~ Normal(0, 1) # Regularizing prior. Assumes effect size will shrink, but is very wide
    effectτ ~ Gamma(4, 1 / 4 / .2^2) # Weakly informative prior -- standard deviation in effect size is about 20%
    effectσ = effectτ^-.5

    estimateBiasμ ~ Normal(.5, 1) # People usually overestimate small numbers
    estimateBiasτ ~ Gamma(4, 1 / 4 / .2^2)
    estimateBiasσ = effectτ^-.5
    estimateτ ~ Gamma(4, 1 / 4 / .2^2)
    estimateσ = estimateτ^-.5

    campaignBiasμ ~ Normal(.5, 1) # Campaign will probably make people overestimate mask wearing more 
    campaignBiasτ ~ Gamma(4, 1 / 4 / .2^2)
    campaignBiasσ = campaignBiasτ^-.5


    constant ~ filldist(NormalCanon(μ * τ, τ), numVillages)
    randomChange ~ filldist(NormalCanon(timeμ * timeτ, timeτ), numVillages)
    campaignEffect ~ filldist(NormalCanon(effectμ * effectτ, effectτ), numVillages)

    estimateBias ~ filldist(NormalCanon(estimateBiasμ * estimateBiasτ, estimateBiasτ), numVillages)
    campaignBias ~ filldist(NormalCanon(campaignBiasμ * campaignBiasτ, campaignBiasτ), numVillages)


    # Add variables on logit scale to make prediction
    prediction = @. begin
        constant[villages] + 
        randomChange[villages] * dataset[:time] + 
        campaignEffect[villages] * dataset[:followup] * dataset[:campaign]
    end

    predictedEst = @. begin
        prediction + estimateBias[villages] + campaignBias[villages] * dataset[:followup] * dataset[:campaign]
    end

    # Likelihood functions
    for i in 1:dataSize
        maskCounts[i] ~ BinomialLogit(n, prediction[i])
        maskEsts[i] ~ LogitNormal(predictedEst[i], estimateσ)
    end
    
    return μ, σ, timeμ, timeσ, effectμ, effectσ, estimateBiasμ, estimateBiasσ, estimateσ, campaignBiasμ, campaignBiasσ

end

continuousParams = [:campaignBias, :campaignBiasμ, :campaignBiasτ, 
:campaignEffect, :effectμ, :effectτ, :constant, :estimateBias, 
:estimateBiasμ, :estimateBiasτ, :estimateτ, 
:maskEsts, :randomChange, :timeμ, :timeτ, :μ, :τ
]

##
chains = sample(maskingModel(dataset), Gibbs(
    MH(:maskCounts), 
    NUTS(20_000, .96, continuousParams...)
    ), MCMCThreads(), 60_000, Threads.nthreads())

ParadaCarleton · April 7, 2021, 12:02am

Trying to use MH instead throws a completely different error:

##
chains = sample(maskingModel(dataset), Gibbs(MH(:maskCounts), NUTS()), MCMCThreads(), 1000, Threads.nthreads())

ERROR: LoadError: TaskFailedException

    nested task error: TaskFailedException
    Stacktrace:
     [1] wait
       @ ./task.jl:317 [inlined]
     [2] threading_run(func::Function)
       @ Base.Threads ./threadingconstructs.jl:34
     [3] macro expansion
       @ ./threadingconstructs.jl:93 [inlined]
     [4] macro expansion
       @ ~/.julia/packages/AbstractMCMC/oou1a/src/sample.jl:304 [inlined]
     [5] (::AbstractMCMC.var"#30#40"{Bool, Base.Iterators.Pairs{Symbol, UnionAll, Tuple{Symbol}, NamedTuple{(:chain_type,), Tuple{UnionAll}}}, Int64, Int64, Vector{Any}, Vector{UInt64}, Vector{DynamicPPL.Sampler{Gibbs{(:maskCounts,), Tuple{MH{(:maskCounts,), NamedTuple{(), Tuple{}}}, NUTS{Turing.Core.ForwardDiffAD{40}, (), AdvancedHMC.DiagEuclideanMetric}}}}}, Vector{DynamicPPL.Model{var"#7#8", (:dataset,), (), (), Tuple{DataFrame}, Tuple{}}}, Vector{Random._GLOBAL_RNG}})()
       @ AbstractMCMC ./task.jl:406
    
        nested task error: TypeError: in typeassert, expected Union{Missing, UInt16}, got a value of type ForwardDiff.Dual{Nothing, UInt16, 10}

Topic		Replies	Views
Trouble translating a simple PyMC3 model to Turing New to Julia turing	39	5090	November 23, 2021
How to implement a model with missing observations in Turing? Probabilistic Programming	6	546	March 12, 2023
Turing, Inexacterror for DiscreteUniform distribution, with NUTS sampler Probabilistic Programming turing , distributions	5	501	January 4, 2021
How to handle missing values in Turing.jl Probabilistic Programming turing	1	51	August 6, 2025
Inferring missing values in a Turing.jl model Statistics	1	183	March 27, 2024

Imputing Discrete Data

Related topics