Sample dependent variables in Turing

I’d like to sample two variables X1 ~ Uniform(0,1) and X2 ~ Uniform(0, X1)

using Turing
@model function model()
X1 ~ Uniform(1e-5, 1)
X2 ~ Uniform(0, X1)
end
my_model = model(); sample(my_model, NUTS(), 10000)

image

image

However, while the distribution of X1 looks correct (that is it follows Uniform(0,1)), the distribution of X2 is very strange and each time it changes a lot. Analytically the mean of X2 should be 0.25 but this result is never reached. Can anyone help me figure out where I did wrong?

Could this be a matter of not having enough samples? I increased the chain length 100-fold to 1_000_000 and am seeing stats like

Summary Statistics
  parameters      mean       std      mcse      ess_bulk      ess_tail      rhat   ess_per_sec
      Symbol   Float64   Float64   Float64       Float64       Float64   Float64       Float64

          X1    0.4999    0.2887    0.0003   796644.8328   663422.4909    1.0000    19973.0440
          X2    0.2497    0.2201    0.0002   765366.0740   647466.4402    1.0000    19188.8400

Thanks so much! That’s interesting. But I still have this problem.

Actually such a problem doesn’t exist if X2 ~ Normal(X1, 1) or Exponential(X1). Seems only happen if X2 ~ Uniform(0, X1).

what version of turing are you using? Mine are v0.29.3 and v0.30, with both having the same issue. Did you see the same problem when sample size is 10^4 (actually I think that’s big enough if I am not mistaken).

Thanks so much!

I thought I was getting wildly varying means at 10k samples but I’m not, my bad. Indeed, 10k seems plenty.

I’m using Turing v0.32.2 and Distributions v0.25.109.

Ah yes, this was a bug that was recently fixed. In short, the inference was actually performed correctly, but there was a bug in the code transforming the variables to the format we present it to the end-user. If you go to Turing 0.32 everything should work just fine again:)

Can confirm this works on my end with Turing.jl@0.32.3 and 10k samples (reproducible).

Did you see the same problem when sample size is 10^4 (actually I think that’s big enough if I am not mistaken).

Recommend looking at effective sample size (ESS) in the chain when considering “how many samples do I have”. Even if you run 10k iterations, the sampler could get stuck, but that would be indicated in the ESS.

When I run your code (again, on v0.32.3), with 10k iterations I consistently get >5k ESS, which is pleeeeenty for evaluating whether we’re getting the mean correctly:) And in my case, I consistently get 0.5 and 0.25 as expected :+1:

2 Likes

thank you so much! the newest version got it correct. Also thanks for ESS!

Glad to hear:)

Would you mind marking the answer as the “solution”? That way others who arrive at the same issue can easily see the answer:)

Thank you for the reminder. Marked!

1 Like