Turing: Target Variable is a Deterministic Function of Random Variables

Hi, I am new to Julia and struggling to formulate my model correctly. My model consists of:

  • A regressor x_t

  • An intermediate variable y_t = k * x_t + \eta, \eta ~ Normal(0, \sigma)

And finally a target variable z that is a function of y_t and its lags:

  • z_t = \lambda * y_t + \lambda ^ 2 * y_(t-1) + \lambda ^ 3 * y_(t - 2) + …, \lambda ~ Uniform(0, 1)

The variables of interest are k, \eta and \lambda.

My issue is that I am unable to tell Julia that z_t is distributed as above, because by the time I get to defining it, the terms are numbers instead of distributions.

It might be possible to re-parametrize and define z_t as a normally distributed variable with mean equal to z_t above, and some variance that is a function of \eta and \lambda. However I’m not willing to attempt to find this function because I’m sure there is a better way.

Below is a version of my code that produces semi-reasonable code, using 4 lags. I’ve attempted to use Distributions.convolute() multiple times to combine distributions, but I’m not sure how correct this is, and again I’m sure there is a better way.

using Turing, StatsPlots, Random

function lag_transform(λ, y)
    # apply the lag transformation to y
    y_type = promote_type(eltype(λ), eltype(y))
    z = zeros(y_type, 0)

    for i in 1:length(y)
        powers = reverse([λ ^ k for k in range(1, stop=i)])
        previous_terms = zeros(y_type, 0)
        for j in 1:i
            append!(previous_terms, y[j] * powers[j])

        current_z_val = sum(previous_terms)
        append!(z, current_z_val)
    return z

# generate some test data
n_samples = 200
σ = 0.5
k = 2.3
λ = 0.7

x = 1:n_samples
y = [x_val * k + rand(Normal(0, σ)) for x_val in x]
z = lag_transform(λ, y)

@model function model_func(x, z)
    k ~ Uniform(0.5, 3)
    λ ~ Uniform(0.4, 1)
    σ ~ truncated(Normal(0, 1), 0, Inf)

    y = TArray([convert(Real, k * x_val) for x_val in x])

    powers = TArray([convert(Real, λ ^ i) for i in 1:5])

    for n in 5:length(y) # start on 5, because we use the previous 4 lags

        normals = [
            Normal(powers[1] * y[n], powers[1] * σ),
            Normal(powers[2] * y[n - 1], powers[2] * σ),
            Normal(powers[3] * y[n - 2], powers[3] * σ),
            Normal(powers[4] * y[n - 3], powers[4] * σ),
            Normal(powers[5] * y[n - 4], powers[5] * σ)

        z[n] ~ reduce(convolve, normals)

model = model_func(x, z[5:end]) # dont pass all of z_t, because we use the previous 4 lags

chain = sample(model, NUTS(), 5000)

I’ve also tried some other solutions (looking into @submodels, !addlogprob, etc.) but have been completely unsuccessful. Thanks!