# Tips for a large-ish latent variable model in Turing

Hi - I have a data set (~900 x 120) on which I’m trying to do dimension reduction. Eventually, I want to test some factor mixture models, but right now, I simply want to do a basic factor analysis. The data set isn’t huge, but it’s large enough that I’m not expecting things to be fast. However, it’s even slower than I expect.

Here’s a MWE to give a sense of things, on Julia 1.4.2, Windows 64-bit:

```julia
using Turing, Optim

data1 = randn(900, 120)

# p = the dimension of the factor model
# this is just an example; we ignore rotational invariance issues

@model factormodel(dat, p) = begin
    N, K = size(dat)

    σ ~ filldist(InverseGamma(1, 1), K)  # factor uniquenesses
    μ ~ filldist(Normal(0, 10), K)       # variable-specific intercepts
    λ ~ filldist(Normal(0, 1), p, K)     # factor loadings
    F ~ filldist(Normal(0, 1), N, p)     # latent factors

    mu = F * λ

    for k in 1:K
        for i in 1:N
            dat[i, k] ~ Normal(μ[k] + mu[i, k], σ[k])
        end
    end
end;

#=
(1) MLE/MAP estimates take an *extraordinarily* long time, especially compared to e.g., R's factanal() or fa() routines. I would expect it to take a few seconds, maybe. Instead, it takes so long I get tired of waiting and cancel. :( I'm not willing to wait overnight for a model like this, ha ha.
(2) standard NUTS estimation also takes a *very* long time (and seems to spend a lot of time generating values that are "rejected for numerical errors").
(3) basic MH is slightly slow, but usable (-ish).
(4) Going from ForwardDiff to a reverse-mode AD (like Zygote) does not change anything.
=#


map_estimate = optimize(factormodel(data1, 1), MAP())
hmc_estimate = sample(factormodel(data1, 1), NUTS(0.65), 2000)
mh_estimate = sample(factormodel(data1, 1), MH(), 20000)

# Also tested ADVI on master, but gives errors
```

I tested on both the current stable Turing (0.13.0) and master, with the same results. I certainly don’t expect this kind of model to be fast, but I would expect it to run in a feasible amount of time. A simple one-factor model only has a bit over a thousand parameters (900 latent factor scores plus the per-variable parameters), which is sizable but not outlandish.

Is there something obvious I’m missing or doing very wrong? I have read the manual’s section on optimization, and I know the nested loop isn’t ideal, but I’m not sure what to replace it with, or whether it’s the real culprit here. I’d welcome any advice people have.

I would use `LazyArrays` and `arraydist` for that last line to vectorize and fuse everything including the AD. See https://github.com/TuringLang/TuringExamples/blob/290a2c6dcd711bdd755aafb70b8f1441518b64e4/benchmarks/logistic_reg/turing.jl for an example. Loops are generally fast in Julia, but they are still slow in reverse-mode AD. Forward-mode AD is slow for large models with more than a few hundred unobserved random variables. So if you want to make reverse-mode AD fast in Turing or Julia in general, you still need to figure out vectorization and fusing tricks. I find the fact that `LazyArrays` and `arraydist` seamlessly work together to be a powerful trick to help with that. I would also suggest you use ReverseDiff if it works as the AD package of choice. Zygote is excellent but typically has more performance issues.
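As a side note, in the Turing version discussed here (~0.13), selecting the AD backend is a one-line global setting. This is a sketch of that era’s API; check the docs for your version:

```julia
using Turing
using ReverseDiff  # the backend package must be loaded first

# Switch from the default ForwardDiff backend (:forwarddiff) to reverse-mode AD.
Turing.setadbackend(:reversediff)
```

After this call, subsequent `sample` and `optimize` calls that need gradients will use ReverseDiff instead of ForwardDiff.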


Thanks! I’ll give it a try.

Following up on this: by replacing the nested loops with the `LazyArrays` / `arraydist` trick, I got a ~25x speedup for Metropolis-Hastings sampling:

```julia
using LazyArrays

@model factormodel1(dat, p) = begin
    N, K = size(dat)

    σ ~ filldist(InverseGamma(1, 1), K)  # factor uniquenesses
    μ ~ filldist(Normal(0, 10), K)       # variable-specific intercepts
    λ ~ filldist(Normal(0, 1), p, K)     # factor loadings
    F ~ filldist(Normal(0, 1), N, p)     # latent factors

    mu = F * λ .+ μ'

    dat ~ arraydist(LazyArray(Base.broadcasted((m, s) -> Normal(m, s), mu, σ')))
end

@time mh_estimate = sample(factormodel(data1, 1), MH(), 200)    # ~20 s
@time mh_estimate1 = sample(factormodel1(data1, 1), MH(), 200)  # ~0.8 s
```

Most of the speedup comes from `arraydist` rather than `LazyArrays`: when I replaced the last line of `factormodel1` with

```julia
dat ~ arraydist([Normal(mu[i, k], σ[k]) for i in 1:N, k in 1:K])
```

it was only about 10% slower than with the `arraydist(LazyArray(...))` approach. Of course, `MH` doesn’t explore the high-dimensional parameter space well at all, so in practice it produces barely any useful samples.
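As a rough illustration (not from the thread) of the difference between the two forms: the comprehension materializes the full matrix of `Normal` structs on every model evaluation, while the lazy broadcast stores only the broadcast recipe and builds elements on demand, which is also what lets reverse-mode AD fuse the computation:

```julia
using Distributions, LazyArrays

mu, σ = randn(900, 120), rand(120)

# Eager: allocates a 900×120 Matrix{Normal{Float64}} up front
eager = [Normal(mu[i, k], σ[k]) for i in 1:900, k in 1:120]

# Lazy: keeps only the recipe; Normals are constructed element-by-element on access
lazy = LazyArray(Base.broadcasted((m, s) -> Normal(m, s), mu, σ'))

eager[1, 1].μ == lazy[1, 1].μ  # same distributions, different storage
```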

Unfortunately, sampling with `NUTS` is still extremely slow and produces a lot of “numerical errors” warnings: drawing just 20 samples with `NUTS` took 291 seconds on my computer. However, I was able to get decent run times and sample sizes by playing around with different Gibbs schemes; the following ran in 197 seconds and appeared to converge after about 50 samples:

```julia
@time gibbs_estimate = sample(factormodel1(data1, 1),
    Gibbs(NUTS(100, 0.35, :σ), PG(30, :λ), NUTS(100, 0.35, :μ), PG(30, :F)),
    200)
```

This was arrived at by trial and error, so there are probably more efficient ways to do it.

EDIT: hit post too soon