Thanks for the detailed explanation @torfjelde . Below is the original code that I used to represent the DINA model using Turing. Here, based on alpha, I choose either the guess or the (1 - slip) continuous random variable to represent the probability of X. This code gives me the same problem of getting exactly the same samples. Increasing the warmup helps, but not reliably. Marginalizing over alpha and using NUTS (based on my Pyro code) does help reliably, and even parameter recovery is great, but it doesn't scale once alpha covers more than 100 or so students with more than 5 skills. I wonder whether I should:
- Try implementing alpha as continuous rather than discrete so that I can use NUTS.
- Try to marginalize over alpha based on the code recommendation given by @Christopher_Fisher and see how far I can scale the model (a rough sketch of what I mean is included after the model code below).
- Keep alpha discrete but skip differentiation-based algorithms like NUTS and try particle-based ones instead, so that I don't need compositional sampling (see the sketch right after this list). I did try the importance sampling algorithm in Turing; it worked, but parameter recovery was not great.
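For context on that third option, here is a minimal sketch of an all-particle run versus the compositional Gibbs run on the model posted below (i, X, Q are assumed to be in scope; the particle count and HMC settings are placeholders, and the symbol-based Gibbs syntax is the one from older Turing docs, so it may differ on newer releases):

using Turing

# All-particle route: keep alpha discrete and sample everything with PG
# (particle count is just a placeholder).
chain_pg = sample(dina_model(i, X, Q), PG(50), 2_000)

# Compositional route: PG for the discrete alpha, HMC for guess and slip.
chain_gibbs = sample(
    dina_model(i, X, Q),
    Gibbs(PG(50, :alpha), HMC(0.05, 10, :guess, :slip)),
    2_000,
)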
For my parameter recovery test, I generate X using my implementation of the DINA model with parameters that are known beforehand and drawn from the same distributions as in the model. Then I feed X back into the model and check whether similar parameter values can be recovered via posterior sampling (I take the posterior means for the comparison).
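Concretely, the recovery check looks roughly like this (a sketch with made-up sizes; dina_model is the model posted further down, and the sampler call is just one of the options sketched above):

using Turing, Random

Random.seed!(1)
i, j, k = 30, 10, 3

# Known "true" parameters drawn from the same distributions as in the model
Q = rand(0:1, j, k)
true_guess = rand(Beta(1, 1), 1, j)
true_slip = rand(Beta(1, 1), 1, j)
true_alpha = rand(0:1, i, k)

# Simulate responses X from the DINA likelihood
eta = (true_alpha * Q') .== sum(Q, dims = 2)'
prob_X = true_guess .^ (1 .- eta) .* (1 .- true_slip) .^ eta
X = rand.(Bernoulli.(prob_X))

# Fit the model on the simulated X, then compare posterior means to the truth
chain = sample(dina_model(i, X, Q), Gibbs(PG(50, :alpha), HMC(0.05, 10, :guess, :slip)), 2_000)
mean(chain)   # compare the guess/slip summaries against true_guess and true_slip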
Thanks for all the comments. It’s helping me to think further about my problem.
In the code below,
- i = number of students
- j = number of questions
- k = number of skills required to answer the questions
- Q is {j,k} dims and contains {0,1}. Q represents which of the k skills the jth question requires.
- X is {i,j} dims and contains {0,1}. X represents whether the answer given by the ith student to the jth question is correct or not.
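For example (a made-up Q just to illustrate the encoding), with j = 2 questions and k = 3 skills,

Q = [1 0 1;
     0 1 0]

says that question 1 requires skills 1 and 3, while question 2 requires only skill 2.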
This implementation is similar to the one suggested in Bayesian Estimation of the DINA Model With Gibbs Sampling
@model function dina_model(i, X, Q)
    j, k = size(Q)

    # Item parameters
    guess ~ filldist(Beta(1, 1), 1, j)
    slip ~ filldist(Beta(1, 1), 1, j)

    # Student and skill parameters
    alpha ~ filldist(Bernoulli(0.5), i, k)

    # Calculating skills required to answer correctly
    skills_mastered = alpha * Q'
    skills_required = sum(Q, dims = 2)

    # Represents the ideal situation to answer the question, i.e. when the student is not cheating
    eta = skills_mastered .== skills_required'

    # Selecting either a or b based on eta
    a = guess .^ (1 .- eta)
    b = (1 .- slip) .^ eta

    prob_X = a .* b
    X ~ arraydist(Bernoulli.(prob_X))
    return X
end
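Since marginalization keeps coming up, here is the kind of thing I have in mind for the second option above. This is only my own rough, untested sketch (not the code @Christopher_Fisher recommended): it enumerates all 2^k skill profiles per student, adds the summed likelihood via Turing.@addlogprob!, and uses logsumexp from LogExpFunctions, so only guess and slip remain as parameters and NUTS applies directly.

using Turing
using LogExpFunctions: logsumexp

@model function dina_model_marginalized(X, Q)
    i = size(X, 1)
    j, k = size(Q)

    guess ~ filldist(Beta(1, 1), 1, j)
    slip ~ filldist(Beta(1, 1), 1, j)

    # Enumerate all 2^k possible skill profiles once
    profiles = [digits(c, base = 2, pad = k) for c in 0:(2^k - 1)]
    skills_required = vec(sum(Q, dims = 2))

    for s in 1:i
        # Log-likelihood of student s's responses under each profile,
        # plus the uniform Bernoulli(0.5) prior on that profile
        lps = map(profiles) do a
            eta = (Q * a) .== skills_required
            p = vec(guess) .^ (1 .- eta) .* (1 .- vec(slip)) .^ eta
            sum(logpdf.(Bernoulli.(p), X[s, :])) + k * log(0.5)
        end
        Turing.@addlogprob! logsumexp(lps)
    end
end

# e.g. chain = sample(dina_model_marginalized(X, Q), NUTS(), 2_000)

The obvious downside is the 2^k sum per student, which is exactly why this route stops scaling for me beyond roughly 5 skills.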