# Getting Turing to work with a hierarchical multinomial regression model

Hi everyone,

I’m sorry if this question is too specific. I tried following advice in previous similar threads but couldn’t get anything to work, so figured I should try asking myself.

For context, I’m modeling 5000 Hangman games, where players reveal a hidden word by guessing letters. Each player played games in two conditions, and my goal is to compare their behaviour across the two conditions. My approach is to fit a multinomial logistic regression model to predict letter selections with three predictors: `prior`, `post` and `eig`. Each predictor is represented as a vector of length 26 (for the 26 letters). Since the same letter cannot be selected more than once in the same game, I also have a `mask` vector with 1s for letters that can still be selected and 0s for letters that can’t.
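For concreteness, the mask just turns the usual softmax into a renormalised softmax over the remaining letters. A minimal sketch of that idea (plain Python for illustration only, not my actual Julia model):

```python
import math

def masked_softmax(logits, mask):
    # Exponentiate, zero out letters that were already guessed,
    # then renormalise so the remaining probabilities sum to 1.
    w = [math.exp(l) * m for l, m in zip(logits, mask)]
    s = sum(w)
    return [x / s for x in w]

# Three "letters"; the second one was already guessed (mask = 0).
p = masked_softmax([2.0, 1.0, 0.5], [1, 0, 1])
# p[1] is exactly 0 and the rest sums to 1
```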

Below is my model. The arguments are:

• priors: n × 26 Float64 matrix
• posts: n × 26 Float64 matrix
• eig: n × 26 Float64 matrix
• masks: n × 26 Int64 matrix (0 or 1)
• subj: length-n Int64 vector (the subject number)
• condition: length-n Int64 vector (0 or 1)
• y: length-n Int64 vector (the index of the selected letter, between 1 and 26)
```julia
@model function logistic_regression(priors, posts, eig, masks, subj, condition, y)
    n = size(priors, 1)  # number of observations
    length(y) == size(posts, 1) == size(masks, 1) == n ||
        throw(DimensionMismatch("number of observations is not equal"))

    n_subj = length(unique(subj))

    # Hyperparameters for main effects
    mu_prior ~ Normal(0, 10)
    sigma_prior ~ Gamma(2, 1)

    mu_post ~ Normal(0, 10)
    sigma_post ~ Gamma(2, 1)

    mu_eig ~ Normal(0, 10)
    sigma_eig ~ Gamma(2, 1)

    # Hyperparameters for condition effects
    mu_prior_delta ~ Normal(0, 10)
    sigma_prior_delta ~ Gamma(2, 1)

    mu_post_delta ~ Normal(0, 10)
    sigma_post_delta ~ Gamma(2, 1)

    mu_eig_delta ~ Normal(0, 10)
    sigma_eig_delta ~ Gamma(2, 1)

    # Subject-level parameters: main effects
    coef_prior ~ filldist(Normal(mu_prior, sigma_prior), n_subj)
    coef_post ~ filldist(Normal(mu_post, sigma_post), n_subj)
    coef_eig ~ filldist(Normal(mu_eig, sigma_eig), n_subj)

    # Subject-level parameters: condition effects
    coef_prior_delta ~ filldist(Normal(mu_prior_delta, sigma_prior_delta), n_subj)
    coef_post_delta ~ filldist(Normal(mu_post_delta, sigma_post_delta), n_subj)
    coef_eig_delta ~ filldist(Normal(mu_eig_delta, sigma_eig_delta), n_subj)

    for i in 1:n
        # Linear predictor: main effects plus condition-specific shifts
        logit = coef_prior[subj[i]] .* priors[i, :] .+
                coef_post[subj[i]] .* posts[i, :] .+
                coef_eig[subj[i]] .* eig[i, :] .+
                condition[i] .* (coef_prior_delta[subj[i]] .* priors[i, :] .+
                                 coef_post_delta[subj[i]] .* posts[i, :] .+
                                 coef_eig_delta[subj[i]] .* eig[i, :])
        # Mask out letters that were already guessed, then renormalise
        v = masks[i, :] .* exp.(logit)
        v = v ./ sum(v)
        y[i] ~ Categorical(v)
    end
end;
```

I tried fitting the model with both HMC and NUTS, and neither really works. HMC is slow (>5 hours for one chain of 1500 steps) and returns chains that are stuck at a single set of coordinates, and NUTS is even slower (the current ETA is 8:30 hours to complete a chain of 100 steps). When I try to run more than one chain using `MCMCThreads()`, it doesn’t even start running.

I followed previous advice and am using `Turing.setrdcache(true)` and `Turing.setadbackend(:reversediff)` but this doesn’t seem to help.

Any advice would be greatly appreciated. Many thanks!

You might find my post on this topic helpful. I’m working on a hierarchical multinomial problem, as well. I was able to improve the performance with a few tricks from the Turing developers.


Thanks Jason! This does seem very relevant.
In the meantime I found a faster solution that I’m satisfied with: I estimate the MLE coefficients for each subject using `optimize`, and do frequentist statistics at the group level. I don’t get posterior probabilities over group-level coefficients, but then again I’m not sure that’s what I’m after in the first place. Thanks again,
-Matan