# Modified version of Poisson NMF modeling via NUTS sampler is very slow

Hi! I am new to Turing.jl and trying to using NUTS to infer the parameters from a modified version of Poisson Non-negative Matrix Factorization(NMF) model. The demo codes are listed as below. It worked for a small number of draws but when I tried 500 draws, it took me like 12 hours to run the sampler and got killed. I wonder if any part of my codes is incorrect or there are any ways to accelerate the sampling procedure? I really appreciate your help!

``````eps_k = 200.0
a0_k = 6.0
b0_k = eps_k*(a0_k - 1)

alpha = 1.0
a1 = 1.0
b1 = 1.0

@model function PoissonNMF(x, r_k, ::Type{T} = Float64) where {T}

mu = Vector{T}(undef, K)
a = Vector{T}(undef, J)
t = Matrix{T}(undef, J, K)
r_inf = Matrix{T}(undef, I, K_u)

for j = 1:J
a[j] ~ InverseGamma(a1, b1)
end

for i = 1:K
mu[i] ~ InverseGamma(a0_k,b0_k)
end

for j = 1:J, k = 1:K
t[j,k] ~ Gamma(a[j],mu[k]/a[j])
end

for kk = 1:K_k
r_k[:,kk] ~ Dirichlet(I, alpha)
end

for ku = 1:K_u
r_inf[:,ku] ~ Dirichlet(I, alpha)
end

r = hcat(r_k, r_inf)
rate = r*t'

for i = 1:I, j = 1:J
x[i,j] ~ Poisson(rate[i,j])
end
end

model = PoissonNMF(x, r_k)
draws = 500
acp_prob = 0.65
lf_steps = 5
ts = time()
println("The sampling starts")
flush(stdout)
chn = sample(model, NUTS(lf_steps, acp_prob), draws, init_ϵ = 0.000389862060546875)
``````

Hi,

In general, eliminating global variables will help improve speed (e.g., J, K a1, etc.) . In this particular case, I’m not sure that is the primary problem.

Another potential issue is the auto-diff backend. In Turing, the default forward mode works well for a low number of parameters, such as 5-10. In your case, it looks like you might have much more. If that is true, I recommend using ReverseDiff. Unfortunately, it requires you to vectorize your code in order to get good performance. That might provide an improvement of several orders of magnitude.

If you can provide a fully runnable example, someone might be able to provide more guidance.

Hi! Thanks for the comments! I have checked how to vectorize my codes and still am confused about vectorizing the parameters like this part

``````for j = 1:J, k = 1:K
t[j,k] ~ Gamma(a[j],mu[k]/a[j])
end
``````

Yeah, here is a demo code with synthetic data.

``````using Turing
using Optim
using Printf

x = [67 15 79 114 33; 21 16 43 41 4; 7 11 14 17 6; 1 3 5 3 0; 119 22 89 193 80; 28 5 27 48 31; 19 4 14 30 16; 1 4 13 4 2; 13 11 18 16 14; 14 18 49 47 16; 106 17 68 131 86; 4 7 17 11 4]
println(size(x))

r_k = [0.0955774895744073 0.18949879345012058; 7.189496934450597e-9 0.0009537832013093777; 0.01183664879213095 0.0038035674109844556; 0.01906680209476258 0.0012755013375099333; 0.0865186633906054 0.3979584390193475; 0.09657592077201838 0.07261594631055265; 0.022973198407369454 0.07093309190047935; 0.00041223552282597023 4.410682039031856e-5; 0.09570716261434926 0.010371759404443884; 0.010831561823427063 0.001405200855652656; 0.5065116350310219 0.2402186872546876; 0.053988674787584665 0.010921123034521816]
println(size(r_k))

#####hyperparameter set-up######
eps_k = 200.0
a0_k = 6.0
b0_k = eps_k*(a0_k - 1)

eps_u = 200.0
a0_u = 6.0
b0_u = eps_u*(a0_u - 1)

alpha = 1.0
a1 = 1.0
b1 = 1.0

@model function PoissonNMF(x, r_k, ::Type{T} = Float64) where {T}

I,J = size(x)
K = 4
_,K_k = size(r_k)
K_u = K - K_k
mu = Vector{T}(undef, K)
a = Vector{T}(undef, J)
t = Matrix{T}(undef, J, K)
r_inf = Matrix{T}(undef, I, K_u)

for j = 1:J
a[j] ~ InverseGamma(a1, b1)
end

for i = 1:K
mu[i] ~ InverseGamma(a0_k,b0_k)
end

for j = 1:J, k = 1:K
t[j,k] ~ Gamma(a[j],mu[k]/a[j])
end

for kk = 1:K_k
r_k[:,kk] ~ Dirichlet(I, alpha)
end

for ku = 1:K_u
r_inf[:,ku] ~ Dirichlet(I, alpha)
end

r = hcat(r_k, r_inf)
rate = r*t'

for i = 1:I, j = 1:J
x[i,j] ~ Poisson(rate[i,j])
end
end

model = PoissonNMF(x, r_k)
draws = 500
acp_prob = 0.65
lf_steps = 5
ts = time()
println("The sampling starts")
flush(stdout)
chn = sample(model, NUTS(lf_steps, acp_prob), draws, init_ϵ = 0.000389862060546875)

``````

Check out this section of the Turing performance tips:
https://turing.ml/dev/docs/using-turing/performancetips#special-care-for-codetrackercode-and-codezygotecode
Instead of preallocating your arrays and then filling them using a loop, you probably want to use `filldist` (for your priors) and `arraydist` (for the Poisson observations). For instance, you’d initialize the prior for `a` like

``````a ~ filldist(InverseGamma(a1, b1), J)
``````

and you’d write the observation likelihood as

``````x ~ arraydist(Poisson.(rate))
``````
2 Likes

Thanks! I vectorized priors a, mu, r_k and r_inf. It works pretty well!
May I ask if there are any ways to vectorize t?

This should work:

``````t ~ arraydist(Gamma.(a, mu' ./ a))
``````

Doing `mu' ./ a` will expand to a J x K matrix, and then the call to `Gamma.()` knows to broadcast `a` over each column.