How to evaluate multivariate log normal distribution at arbitrary point?

Suppose I have the following distribution over two variables

dist = MvLogNormal(mean_a_b, cov_a_b)

Suppose h is an array of dimension 1000\times 2. For each row of array h, the first entry represents the rank of the first variable, and the second entry is the rank of the second variable. Is there a way to evaluate the joint distribution dist for each of the 1000 points in h?

1 Like

The natural thing you can do is use logpdf in a for loop over each row of h. In Julia, this will be essentially as fast as possible. But if for some reason you need vectorization (e.g. to make autodiff faster), you can leverage the fact that Distributions.jl knows a matrix can be a sequence of vector observations. The thing to remember is that each observation needs to be a matrix column, hence the transpose:

julia> using Distributions, LinearAlgebra

julia> dist = MvLogNormal(zeros(2), I);

julia> h = rand(1000, 2);

julia> logpdf(dist, transpose(h))
1000-element Vector{Float64}:

If Distributions.jl were less clever, you could also vectorize it manually by wrapping the first argument in a Ref, which tells Julia to only broadcast on the second element:

julia> logpdf.(Ref(dist), eachrow(h))
1000-element Vector{Float64}:

If I understood correctly, the h matrix contains ranks of the components in the column of 1000 pairs. This makes the overall distribution dependent across all pairs.
If it was one column, sampling could be done by sampling, sorting and permuting. If it where several columns and diagonal covariance matrix, the problem again could be split into one column versions. The way it is presented might require some MCMC sampling.