Hey guys!
Does anybody know the most efficient way of sampling from a multivariate normal distribution whose parameters are constantly updated? I'm constructing a Gibbs sampler for an MVN whose parameters depend on matrix multiplication and inversion, so I want to make things as efficient as possible.
For now, I've considered two approaches:
- Simply declare a new multivariate normal distribution per iteration using the Distributions package, i.e. declaring a new variable each iteration.
- Compute the Cholesky decomposition of my covariance matrix and then sample from a standard normal of appropriate dimensions (see the sketch below). However, this seems to cost much more than it is worth.
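Roughly, the second approach would look like this (just an untested sketch; μ and Σ are my current mean vector and covariance matrix):

using LinearAlgebra
L = cholesky(Σ).L               # lower-triangular factor of the covariance
x = μ .+ L * randn(length(μ))   # one draw from N(μ, Σ)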
I would really value any insights.
Hi and welcome to the Julia Discourse!
Wouldn't 1. compute something like a Cholesky decomposition under the hood anyway? I'd guess 2. can be marginally quicker, but it should be easy to benchmark this (edit: no, see below).
If you really want to shave off a bit more, you can probably copy what Distributions.jl does under the hood and avoid allocating intermediate results, but I'm not sure the speed-up would be worth the time.
Looks like using the package is more efficient:
using Distributions, BenchmarkTools, Random, LinearAlgebra

Random.seed!(42)

function with_intermediate(μΣ, N, M, d)
    sum_out = 0.0
    for i in 1:M
        μ = rand(d)                    # fresh mean each iteration
        Σ = rand(μΣ)                   # fresh covariance drawn from a Wishart
        X = rand(MvNormal(μ, Σ), N)    # let Distributions.jl do the sampling
        sum_out += X[1, 1]             # touch the result so nothing is optimized away
    end
    return sum_out
end

function without_intermediate(μΣ, N, M, d)
    sum_out = 0.0
    for i in 1:M
        μ = rand(d)
        S = cholesky(rand(μΣ)).L       # lower Cholesky factor of the covariance
        X = muladd(S, randn(d, N), μ)  # S * z + μ transforms standard normals
        sum_out += X[1, 1]
    end
    return sum_out
end

function test(d, N, M)
    S = cholesky(I(d))
    μΣ = Wishart(d, S)                 # random covariance generator
    with_intermediate(μΣ, N, M, d)     # warm-up calls to exclude compilation
    without_intermediate(μΣ, N, M, d)
    display(@benchmark with_intermediate($μΣ, $N, $M, $d))
    display(@benchmark without_intermediate($μΣ, $N, $M, $d))
end

test(4, 100, 10)
gives
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  14.792 μs … 53.984 ms  ┊ GC (min … max):  0.00% … 99.91%
 Time  (median):     17.541 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   23.199 μs ± 539.668 μs ┊ GC (mean ± σ):  23.25% ±  1.00%

 [histogram: frequency by time, 14.8 μs to 24.6 μs]

 Memory estimate: 40.47 KiB, allocs estimate: 60.
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  17.791 μs … 57.846 ms  ┊ GC (min … max):  0.00% … 99.90%
 Time  (median):     24.458 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   33.310 μs ± 586.611 μs ┊ GC (mean ± σ):  26.21% ±  3.13%

 [histogram: frequency by time, 17.8 μs to 37.6 μs]

 Memory estimate: 109.84 KiB, allocs estimate: 90.
Hi, thanks for the warm welcome!
It makes total sense. Any idea why calling the package would be faster? I'm still a beginner at understanding why things are as fast as they are in different programming languages, but I'd like to learn everything I can.
Thanks for taking the time btw!
I've quickly gone through the code. If I'm not missing anything, the following explains the difference: if you go to the source of Distributions.jl, you find that it uses the function unwhiten! from PDMats.jl (see Distributions.jl/src/multivariate/mvnormal.jl at efff906e2e6aad180d0be6dcaa9c98d3c398510d · JuliaStats/Distributions.jl · GitHub) to transform a standard MVN to the desired variance-covariance structure. In PDMats.jl, unwhiten! is implemented as follows (PDMats.jl/src/pdmat.jl at 7b61f73aa52d53af980ac5b47a93f93bb9fe253e · JuliaStats/PDMats.jl · GitHub):
function unwhiten!(r::AbstractVecOrMat, a::PDMat, x::AbstractVecOrMat)
    @check_argdims axes(r) == axes(x)
    @check_argdims a.dim == size(x, 1)
    if r === x
        return lmul!(chol_lower(cholesky(a)), r)
    else
        return mul!(r, chol_lower(cholesky(a)), x)
    end
end
which suggests that it uses an in-place multiplication based on the Cholesky decomposition. But perhaps someone more versed in Distributions.jl can weigh in on this?
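For reference, here's a small usage sketch of unwhiten! (my own example, not from the package docs): it maps standard-normal draws onto the covariance structure of a PDMat.

using PDMats, Random

Σ = PDMat([2.0 0.5; 0.5 1.0])   # positive-definite covariance wrapped as a PDMat
z = randn(2, 1000)              # standard normal draws, one column per sample
x = similar(z)
unwhiten!(x, Σ, z)              # columns of x now have covariance Σ (zero mean)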
So, if you're looking for an even faster version: simply copy the code from the _rand! method and make it use your PD matrix directly, rather than reading it from an MvNormal object. You could even strip down unwhiten! a bit more for your use case.
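For example, something like this (an untested sketch; sample_mvn! is a made-up name, and it destroys Σ on each call since cholesky! factorizes in place):

using LinearAlgebra, Random

# Draw one sample from N(μ, Σ) into the preallocated buffer x.
function sample_mvn!(x::AbstractVector, μ::AbstractVector, Σ::AbstractMatrix)
    randn!(x)                        # x ~ N(0, I)
    L = cholesky!(Symmetric(Σ)).L    # in-place lower Cholesky factor
    lmul!(L, x)                      # x ~ N(0, Σ)
    x .+= μ                          # x ~ N(μ, Σ)
    return x
end

That way the only per-iteration allocations left in your Gibbs step are the small wrappers around the factorization.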
Nice! I didn't know things could get so deep for efficiency.
Thanks for the help!