Hi, I am new to Julia and struggling to formulate my model correctly. My model consists of:
-
A regressor x_t
-
An intermediate variable y_t = k * x_t + \eta, \eta ~ Normal(0, \sigma)
And finally a target variable z that is a function of y_t and its lags:
- z_t = \lambda * y_t + \lambda ^ 2 * y_(t-1) + \lambda ^ 3 * y_(t - 2) + …, \lambda ~ Uniform(0, 1)
The variables of interest are k, \eta and \lambda.
My issue is that I am unable to tell Julia that z_t is distributed as above, because by the time I get to defining it, the terms are numbers instead of distributions.
It might be possible to re-parametrize and define z_t as a normally distributed variable with mean equal to z_t above, and some variance that is a function of \eta and \lambda. However I’m not willing to attempt to find this function because I’m sure there is a better way.
Below is a version of my code that produces semi-reasonable code, using 4 lags. I’ve attempted to use Distributions.convolute() multiple times to combine distributions, but I’m not sure how correct this is, and again I’m sure there is a better way.
using Turing, StatsPlots, Random
Random.seed!(123)
function lag_transform(λ, y)
# apply the lag transformation to y
y_type = promote_type(eltype(λ), eltype(y))
z = zeros(y_type, 0)
for i in 1:length(y)
powers = reverse([λ ^ k for k in range(1, stop=i)])
previous_terms = zeros(y_type, 0)
for j in 1:i
append!(previous_terms, y[j] * powers[j])
end
current_z_val = sum(previous_terms)
append!(z, current_z_val)
end
return z
end
# generate some test data
n_samples = 200
σ = 0.5
k = 2.3
λ = 0.7
x = 1:n_samples
y = [x_val * k + rand(Normal(0, σ)) for x_val in x]
z = lag_transform(λ, y)
@model function model_func(x, z)
k ~ Uniform(0.5, 3)
λ ~ Uniform(0.4, 1)
σ ~ truncated(Normal(0, 1), 0, Inf)
y = TArray([convert(Real, k * x_val) for x_val in x])
powers = TArray([convert(Real, λ ^ i) for i in 1:5])
for n in 5:length(y) # start on 5, because we use the previous 4 lags
normals = [
Normal(powers[1] * y[n], powers[1] * σ),
Normal(powers[2] * y[n - 1], powers[2] * σ),
Normal(powers[3] * y[n - 2], powers[3] * σ),
Normal(powers[4] * y[n - 3], powers[4] * σ),
Normal(powers[5] * y[n - 4], powers[5] * σ)
]
z[n] ~ reduce(convolve, normals)
end
end
model = model_func(x, z[5:end]) # dont pass all of z_t, because we use the previous 4 lags
chain = sample(model, NUTS(), 5000)
plot(chain)
I’ve also tried some other solutions (looking into @submodels, !addlogprob, etc.) but have been completely unsuccessful. Thanks!