Modified version of Poisson NMF modeling via NUTS sampler is very slow

hahawalk · June 28, 2022, 1:44pm

Hi! I am new to Turing.jl and am trying to use NUTS to infer the parameters of a modified Poisson non-negative matrix factorization (NMF) model. The demo code is listed below. It worked for a small number of draws, but when I tried 500 draws the sampler ran for about 12 hours and was then killed. Is any part of my code incorrect, or is there a way to speed up sampling? I really appreciate your help!

eps_k = 200.0
a0_k = 6.0
b0_k = eps_k*(a0_k - 1)

alpha = 1.0
a1 = 1.0
b1 = 1.0

@model function PoissonNMF(x, r_k, ::Type{T} = Float64) where {T}

  mu = Vector{T}(undef, K)
  a = Vector{T}(undef, J)
  t = Matrix{T}(undef, J, K)
  r_inf = Matrix{T}(undef, I, K_u) 

  for j = 1:J
    a[j] ~ InverseGamma(a1, b1)
  end

  for i = 1:K
    mu[i] ~ InverseGamma(a0_k, b0_k)
  end

  for j = 1:J, k = 1:K
    t[j,k] ~ Gamma(a[j],mu[k]/a[j])
  end

  for kk = 1:K_k
    r_k[:,kk] ~ Dirichlet(I, alpha)
  end

  for ku = 1:K_u
    r_inf[:,ku] ~ Dirichlet(I, alpha)
  end

  r = hcat(r_k, r_inf)
  rate = r*t'

  for i = 1:I, j = 1:J
    x[i,j] ~ Poisson(rate[i,j])
  end
end

model = PoissonNMF(x, r_k)
draws = 500
acp_prob = 0.65
lf_steps = 5
ts = time()
println("The sampling starts")
flush(stdout)
chn = sample(model, NUTS(lf_steps, acp_prob), draws, init_ϵ = 0.000389862060546875)

Christopher_Fisher · June 28, 2022, 9:42pm

Hi,

In general, eliminating global variables (e.g., J, K, a1, etc.) will help improve speed. In this particular case, I’m not sure that is the primary problem.
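One way to act on the global-variable point, as a minimal sketch rather than a fix confirmed in this thread: in Julia, non-constant globals have no fixed type, so code that reads them cannot be specialized on their type. Declaring the hyperparameters `const` (or passing them as model arguments) avoids this. The helper `prior_mean_mu` is hypothetical and is only there to show a function reading the constants.

```julia
# Hyperparameters from the original post, declared `const` so that
# code reading them can be compiled with known types.
const eps_k = 200.0
const a0_k = 6.0
const b0_k = eps_k * (a0_k - 1)

# Hypothetical helper: the mean of InverseGamma(a0_k, b0_k),
# which is b0_k / (a0_k - 1) when a0_k > 1.
prior_mean_mu() = b0_k / (a0_k - 1)
```

With these values the prior mean of each `mu[k]` equals `eps_k`, which appears to be the intent behind the definition of `b0_k`.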

Another potential issue is the auto-diff backend. In Turing, the default forward-mode backend works well for a small number of parameters, such as 5-10. Your model appears to have many more. If so, I recommend using ReverseDiff. Unfortunately, it requires you to vectorize your code to get good performance. That might provide an improvement of several orders of magnitude.
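How the backend is selected depends on the Turing version, so treat the following as a sketch under assumptions. Around the time of this thread the switch was global, via `Turing.setadbackend(:reversediff)` and optionally `Turing.setrdcache(true)`. Recent releases instead take an `adtype` keyword on the sampler, which is what the block below assumes. The toy model `ToyPoisson` and its data are made up for illustration.

```julia
using Turing
using ReverseDiff  # must be loaded for the ReverseDiff backend to be usable

# Made-up toy model: vectorized prior and likelihood, no loops over `~`.
@model function ToyPoisson(y)
    rate ~ filldist(Gamma(2.0, 3.0), length(y))
    y ~ arraydist(Poisson.(rate))
end

y = [3, 7, 1, 12, 5, 9, 4, 6]
model = ToyPoisson(y)

# Assumption: a recent Turing release where samplers accept `adtype`.
# `compile=true` caches the recorded tape, which is only safe when the
# model has no branches that depend on parameter values.
sampler = NUTS(; adtype=AutoReverseDiff(; compile=true))
chn = sample(model, sampler, 200; progress=false)
```

Reverse mode tends to pay off as the parameter count grows; for a handful of parameters the default forward mode is often just as fast.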

If you can provide a fully runnable example, someone might be able to provide more guidance.

hahawalk · July 5, 2022, 3:47pm

Hi! Thanks for the comments! I have looked into how to vectorize my code, but I am still unsure how to vectorize parameters like this part:

for j = 1:J, k = 1:K
  t[j,k] ~ Gamma(a[j],mu[k]/a[j])
end

Here is a demo with synthetic data.

using Turing
using Optim
using Printf

x = [67 15 79 114 33; 21 16 43 41 4; 7 11 14 17 6; 1 3 5 3 0; 119 22 89 193 80; 28 5 27 48 31; 19 4 14 30 16; 1 4 13 4 2; 13 11 18 16 14; 14 18 49 47 16; 106 17 68 131 86; 4 7 17 11 4]
println(size(x))

r_k = [0.0955774895744073 0.18949879345012058; 7.189496934450597e-9 0.0009537832013093777; 0.01183664879213095 0.0038035674109844556; 0.01906680209476258 0.0012755013375099333; 0.0865186633906054 0.3979584390193475; 0.09657592077201838 0.07261594631055265; 0.022973198407369454 0.07093309190047935; 0.00041223552282597023 4.410682039031856e-5; 0.09570716261434926 0.010371759404443884; 0.010831561823427063 0.001405200855652656; 0.5065116350310219 0.2402186872546876; 0.053988674787584665 0.010921123034521816]
println(size(r_k))


# Hyperparameter setup
eps_k = 200.0
a0_k = 6.0
b0_k = eps_k*(a0_k - 1)

eps_u = 200.0
a0_u = 6.0
b0_u = eps_u*(a0_u - 1)

alpha = 1.0
a1 = 1.0
b1 = 1.0

@model function PoissonNMF(x, r_k, ::Type{T} = Float64) where {T}

  I,J = size(x)
  K = 4
  _,K_k = size(r_k)
  K_u = K - K_k
  mu = Vector{T}(undef, K)
  a = Vector{T}(undef, J)
  t = Matrix{T}(undef, J, K)
  r_inf = Matrix{T}(undef, I, K_u)

  for j = 1:J
    a[j] ~ InverseGamma(a1, b1)
  end

  for i = 1:K
    mu[i] ~ InverseGamma(a0_k, b0_k)
  end

  for j = 1:J, k = 1:K
    t[j,k] ~ Gamma(a[j],mu[k]/a[j])
  end

  for kk = 1:K_k
    r_k[:,kk] ~ Dirichlet(I, alpha)
  end

  for ku = 1:K_u
    r_inf[:,ku] ~ Dirichlet(I, alpha)
  end

  r = hcat(r_k, r_inf)
  rate = r*t'

  for i = 1:I, j = 1:J
    x[i,j] ~ Poisson(rate[i,j])
  end
end

model = PoissonNMF(x, r_k)
draws = 500
acp_prob = 0.65
lf_steps = 5
ts = time()
println("The sampling starts")
flush(stdout)
chn = sample(model, NUTS(lf_steps, acp_prob), draws, init_ϵ = 0.000389862060546875)

ElOceanografo · July 5, 2022, 4:35pm

Check out this section of the Turing performance tips:
https://turing.ml/dev/docs/using-turing/performancetips#special-care-for-codetrackercode-and-codezygotecode
Instead of preallocating your arrays and then filling them using a loop, you probably want to use filldist (for your priors) and arraydist (for the Poisson observations). For instance, you’d define the prior for a like this:

a ~ filldist(InverseGamma(a1, b1), J)

and you’d write the observation likelihood as

x ~ arraydist(Poisson.(rate))
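To see that these vectorized forms describe the same densities as the loops, here is a small check outside of any model. It is a sketch with made-up values for `a`, `rate`, and `x`; only `filldist`, `arraydist`, and the distributions themselves come from the thread.

```julia
using Turing  # re-exports Distributions and provides filldist / arraydist

J = 5
a1, b1 = 1.0, 1.0

# filldist: J i.i.d. copies of one distribution, treated as one vector-valued prior.
prior_a = filldist(InverseGamma(a1, b1), J)
a = [0.5, 1.0, 1.5, 2.0, 2.5]                 # made-up values
lp_vec = logpdf(prior_a, a)
lp_loop = sum(logpdf(InverseGamma(a1, b1), a[j]) for j in 1:J)

# arraydist: one (possibly different) distribution per array entry.
rate = [1.0 2.0; 3.0 4.0; 5.0 6.0]            # made-up rates
x = [1 2; 3 4; 5 6]                           # made-up counts
lik = arraydist(Poisson.(rate))
lp_x_vec = logpdf(lik, x)
lp_x_loop = sum(logpdf(Poisson(rate[i, j]), x[i, j])
                for i in axes(x, 1), j in axes(x, 2))
```

In both cases the vectorized log density should match the sum over the loop; the speed-up comes from the sampler seeing one array-valued variable instead of many scalar ones.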

hahawalk · July 6, 2022, 1:41am

Thanks! I vectorized priors a, mu, r_k and r_inf. It works pretty well!
May I ask if there are any ways to vectorize t?

ElOceanografo · July 6, 2022, 3:21pm

This should work:

t ~ arraydist(Gamma.(a, mu' ./ a))

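Doing mu' ./ a broadcasts the 1 x K row mu' against the length-J vector a, giving a J x K matrix whose (j, k) entry is mu[k] / a[j]. The call to Gamma.() then broadcasts a down each column, so entry (j, k) is Gamma(a[j], mu[k] / a[j]).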
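Putting the suggestions in this thread together, a fully vectorized version of the model might look like the sketch below. This is an assumption about how the pieces combine, not code posted by anyone here: the name `PoissonNMFVec`, the choice to pass hyperparameters as arguments, and the randomly generated stand-in data are all illustrative. The first few lines also check the broadcasting shape on small made-up vectors.

```julia
using Turing
using Random

# Shape check on made-up values: J = 3, K = 2.
a_demo = [1.0, 2.0, 4.0]
mu_demo = [10.0, 20.0]
scale = mu_demo' ./ a_demo          # 3 x 2, scale[j, k] == mu_demo[k] / a_demo[j]

# Hypothetical vectorized model; hyperparameters are arguments, not globals.
@model function PoissonNMFVec(x, r_k, K, a0_k, b0_k, a1, b1, alpha)
    I, J = size(x)
    K_k = size(r_k, 2)
    K_u = K - K_k

    a ~ filldist(InverseGamma(a1, b1), J)
    mu ~ filldist(InverseGamma(a0_k, b0_k), K)
    t ~ arraydist(Gamma.(a, mu' ./ a))            # J x K matrix of Gammas

    r_k ~ filldist(Dirichlet(I, alpha), K_k)      # observed, one Dirichlet per column
    r_inf ~ filldist(Dirichlet(I, alpha), K_u)    # latent, one Dirichlet per column

    rate = hcat(r_k, r_inf) * t'                  # I x J
    x ~ arraydist(Poisson.(rate))
end

# Stand-in data with the same shapes as the demo in this thread (not the real data).
Random.seed!(1)
x = rand(Poisson(20.0), 12, 5)
r_k = rand(Dirichlet(12, 1.0), 2)

model = PoissonNMFVec(x, r_k, 4, 6.0, 1000.0, 1.0, 1.0, 1.0)
```

Sampling could then be started with something like `sample(model, NUTS(200, 0.65), 500)`. Note that the first positional argument of `NUTS` is the number of adaptation steps and the second is the target acceptance rate, so passing 5 there leaves very little warm-up for step-size adaptation.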
