How to make a model applicable to Turing.jl?

I’ve implemented a model, for which I would like to use Turing.jl to evaluate the posterior distribution of the model parameters. The problem I’m facing is that the way I’m storing the model parameters (around 20-40, depending on the model input) does not allow the direct application of Turing.jl, as this package requires my model to accept variables of type ForwardDiff.Dual (if my interpretation of the error message is correct). Currently, I’m storing the parameters whithin a dictionary and a DataFrame object that is contained in this dictionary.

My question is: What data structure and types should I use instead to allow for the usage of Turing.jl?

No problems storing duals in either Dicts or DataFrames:

julia> using DataFrames, ForwardDiff

julia> DataFrame(x = [ForwardDiff.Dual(0, 1)], y = [Dict(1 => ForwardDiff.Dual(1, 2))])
1×2 DataFrame
 Row │ x                   y                                 
     │ Dual…               Dict…                             
   1 │ Dual{Nothing}(0,1)  Dict{Int64, ForwardDiff.Dual{Not…

so you’ll have to give us an MWE to help.

Thanks for your quick answer nilshg. This is good to know!

I tried to make a MWE of my problem:

using Distributions
using FillArrays
using StatsPlots
using LinearAlgebra
using Random
using Parameters
using Turing
using MCMCChains
using Roots
using DataFrames
using LabelledArrays

# Set a random seed to ensure reproducibility of the results

function toy_function(y, p)
    (x,params) = p
    return x.^2 .* params.value[1] .+ y .* params.value[2] .+ params.value[3]

function toy_model(x, params)
    sol = zeros(eltype(params.value), length(x))
    for (idx, x_value) in enumerate(x)
        zero_problem = ZeroProblem(toy_function, zero(eltype(params.value)))
        sol[idx] = solve(zero_problem, Order2(), p=(x_value,params), maxevals=1000)
    return sol

# Define prior distributions
param_prior_distributions = DataFrame(name=["a","b","c"],
                                      distr=[Normal(3,10), Normal(1,2), Normal(3,4)])
true_parameter_values = DataFrame(name=["a","b","c"],

# Define the measurement model
normalmodel = Normal(0, 1)

# Evaluate synthetic measurement data
Nx = 100
x_values = collect(range(-2, 2, length=Nx))
y_values = toy_model(x_values, true_parameter_values) + rand(normalmodel, Nx)

# Plot measurement data
scatter(x_values, y_values; legend=false, title="Synthetic Dataset")

@model function turing_toy_model(x, y, param_prior_distributions)

    params = DataFrame(name=["a","b","c"], value=[0.0, 0.0, 0.0])
    for (idx, _) in enumerate(param_prior_distributions.distr)
        params.value[idx] ~ param_prior_distributions.distr[idx]

    mu = toy_model(x, params)
    N = length(x)
    for i in 1:N
        y[i] ~ Normal(mu[i], std(normalmodel))
    return y


my_turing_toy_model = turing_toy_model(x_values, y_values, param_prior_distributions);

# perform sampling
nsamples = 100
chain = sample(my_turing_toy_model, NUTS(0.65), nsamples);


a_posteriori_mean_values = describe(chain)[1][:,:mean];
a_posteriori_std_values = describe(chain)[1][:,:std];

println("true parameter values: ", true_parameter_values)
println("a_posteriori_mean_values:", a_posteriori_mean_values)
println("a_posteriori_std_values:", a_posteriori_std_values)

y_values_exact = toy_model(x_values, true_parameter_values)
y_values_prior = toy_model(x_values, model_params)
df_a_posteriori_mean_values =DataFrame(name=["a","b","c"], value=a_posteriori_mean_values)
y_values_posterior = toy_model(x_values, df_a_posteriori_mean_values)

plot(x_values, y_values_exact; legend=:bottomright, title="Model Predictions", label=["Exact values"])
scatter!(x_values, y_values_prior, label=["Prior"])
scatter!(x_values, y_values_posterior, label=["Posterior"])

Unfortunately, the above example has another issue: When trying to use a dataframe inside the @model function I get the error:

ArgumentError: column name :columns not found in the data frame