Fitting multiple datasets simultaneously

ElOceanografo · September 8, 2025, 10:32pm

Here’s an example I made up for myself recently when I was trying to figure out how to do this. There are n = 5 datasets, each generated from a quadratic regression. The linear coefficient b and residual noise σ are fitted independently for each dataset inside the Regression submodel, while the intercept a and quadratic coefficient c are defined at the top level in MultiRegression and passed into the submodels as arguments.
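Written out, the model described above looks roughly like this. It is a reading of the code rather than anything stated separately in the post: it assumes the Distributions.jl defaults Normal() = Normal(0, 1) and Gamma() = Gamma(1, 1) (shape 1, scale 1), treats σ as a standard deviation, and uses i for the dataset index and j for the observation index (both introduced here for notation only).

```latex
\begin{aligned}
a &\sim \mathcal{N}(0, 1), \qquad c \sim \mathcal{N}(0, 1) \\
b_i &\sim \mathcal{N}(0, 1), \qquad \sigma_i \sim \mathrm{Gamma}(1, 1), \qquad i = 1, \dots, n \\
y_{ij} &\sim \mathcal{N}\!\left(a + b_i x_{ij} + c\, x_{ij}^2,\ \sigma_i^2\right), \qquad j = 1, \dots, 100
\end{aligned}
```

Only a and c appear without a dataset index, which is what ties the five regressions together.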

using Turing
using StatsPlots

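# Simulate n datasets that share a and c but each have their own slope b
n = 5
a = 1.0 + 2randn()       # shared intercept
bb = 2.0 .+ 2randn(n)    # one linear coefficient per dataset
c = 5randn()             # shared quadratic coefficient

xx = [2rand(100) for _ in 1:n]
# randn.() fuses into the broadcast, so each observation gets its own noise draw
yy = [a .+ b.*x .+ c.*x.^2 .+ randn.() for (b, x) in zip(bb, xx)]
scatter(xx, yy)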

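using LinearAlgebra: I  # identity, used for the isotropic covariance below

@model function Regression(x, y, pars)
    # Dataset-specific parameters
    b ~ Normal()
    σ ~ Gamma()
    # pars.a and pars.c are the shared parameters passed in from the parent model
    μ = pars.a .+ b .* x .+ pars.c .* x.^2
    # σ is the residual standard deviation; MvNormal(μ, σ) with a scalar σ is deprecated
    y ~ MvNormal(μ, σ^2 * I)
end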

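@model function MultiRegression(xx, yy)
    n = length(xx)
    # Shared parameters, sampled once at the top level
    a ~ Normal()
    c ~ Normal()

    stocks = Vector(undef, n)
    for i in 1:n
        pars = (; a, c)
        # Each dataset gets its own Regression submodel (its own b and σ)
        stocks[i] ~ to_submodel(Regression(xx[i], yy[i], pars))
    end
end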

model = MultiRegression(xx, yy)

chain = sample(model, NUTS(), 1000)
plot(chain)

This seems to work pretty well, though there may be better ways to do it I don’t know about.
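One quick way to sanity-check the posterior is to compare it against an ordinary least-squares fit of the same shared-parameter structure. The sketch below is not the Turing approach above. It is plain Julia, it ignores the priors, and it assumes equal noise in every dataset. The names `a_true`, `b_true`, `c_true` and `stacked_design` are made up for this example.

```julia
using Random

Random.seed!(1)

# Simulate data with the same structure as above (names are illustrative)
n = 5
a_true = 1.0 + 2randn()
b_true = 2.0 .+ 2randn(n)
c_true = 5randn()
xx = [2rand(100) for _ in 1:n]
yy = [a_true .+ b .* x .+ c_true .* x.^2 .+ randn(length(x)) for (b, x) in zip(b_true, xx)]

# Build one stacked design matrix with columns
# [1, x (dataset 1), ..., x (dataset n), x^2]
function stacked_design(xx)
    n = length(xx)
    blocks = map(enumerate(xx)) do (i, x)
        slopes = zeros(length(x), n)
        slopes[:, i] .= x   # only dataset i's slope column is nonzero
        hcat(ones(length(x)), slopes, x .^ 2)
    end
    return reduce(vcat, blocks)
end

X = stacked_design(xx)
y = reduce(vcat, yy)

# Least-squares solution: shared a, per-dataset b, shared c
β = X \ y
a_hat, b_hat, c_hat = β[1], β[2:n+1], β[end]

println("a: true = ", a_true, ", estimate = ", a_hat)
println("c: true = ", c_true, ", estimate = ", c_hat)
println("b: true = ", b_true, ", estimate = ", b_hat)
```

With this much data the least-squares estimates should land close to the posterior means from the chain. A large disagreement would suggest an indexing or bookkeeping mistake in one of the two fits.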
