Is there a quicker way to obtain normalized multivariate random normal draws?

schwob · May 17, 2024, 10:32pm

I want to make n random draws from a Multivariate Normal distribution, then divide each element in each of the random draws by the norm of the random draw. Mathematically, I want to obtain \mathbf{y}_i=\mathbf{x}_i/||\mathbf{x}_i|| for i=1,\dots,n, where \mathbf{x}_i\sim\text{N}(\boldsymbol{\mu},\mathbf{V}).

Here is my current code:

X = rand(MvNormal(mu, Hermitian(V)), n)'
Y = X ./ sqrt.(sum(X.^2, dims=2))

This is fast, but I’m hoping to make this even quicker. I am using this code in an iterative algorithm, where this is called hundreds of times per iteration, so any computational savings will add up and help! However, I’m doubtful that it can be made quicker as I have yet to find a way to do so. This discourse is my last hope!

Thank you for your time.

stevengj · May 17, 2024, 10:40pm

This allocates 3 arrays — one for X, another for X.^2, and another for Y. (I’m also not sure if there are any computations happening inside MvNormal that you can precompute if you are re-using mu and V in multiple calls?) You should be able to do better.

Can you post runnable code with typical example values of mu, V, and n? The best solution could depend on these, and posting runnable examples is the best practice anyway if you want to ask for help.

Dan · May 17, 2024, 10:47pm

You may be looking for something along these lines:

julia> n = 10
julia> V = rand(n, n);
julia> D = MvNormal(zeros(n), Hermitian(V*V'));
julia> X = normalize!(rand(D))
10-element Vector{Float64}:
  0.4164718118357489
 -0.13384293852202814
  0.6469353808584453
 -0.16738703557848347
 -0.11171640120444568
  0.11685513640811303
  0.40181104360935266
  0.3343677550039351
 -0.10421558304760534
  0.22769061412809866

schwob · May 17, 2024, 10:48pm

Thank you for your response! Unfortunately, the mu and V will be different every time I try to do these computations. Thank you for asking for sample code; it makes sense that this would help here.

using Random, Distributions, LinearAlgebra
n = 1000 # will not change
mu = 2*ones(3) # will change every call
V = [0.74 -0.08 0.34; -0.08 1.57 -0.5; 0.34 -0.5 1.16] # will change every call
X = rand(MvNormal(mu, Hermitian(V)), n)'
Y = X ./ sqrt.(sum(X.^2, dims=2))

I hope this helps!

Dan · May 17, 2024, 10:52pm

BTW note that most of the time is spent in the random sampling.

I see, you want n normalized samples:

julia> foreach(normalize!, eachrow(X));

might be it - X now holds normalized rows (no need for Y which is an extra copy).

To get another copy:

julia> Y = stack(map(normalize, eachrow(X)); dims=1);

might be good.

schwob · May 17, 2024, 11:01pm

Thank you for your suggested code, Dan. The foreach() approach speeds up the normalization line (see attached photo). Do you believe there is a quicker way to sample from the multivariate normal? I don’t think there is; otherwise, it would be the currently implemented way, but I figured I may as well ask!

Dan · May 17, 2024, 11:05pm

The way MvNormals are sampled is by sampling standard MvNormal and then colouring them with the mean and variance. The sampling might be faster by sampling standard normals for all matrix entries and then finding a way to generate the appropriate matrices for colouring them faster (this is a slow matrix decomposition IIRC).

If the variance generation process is somehow beneficial to optimization (depending on the exact process), this might help.

And a sidenote: When using @benchmark it is better to quote (use $X) for global variables. This sometimes matters.

stevengj · May 17, 2024, 11:10pm

Note that if you are doing this over and over you might want to pre-allocate X and use rand! as well.

However, there’s also the question of the overall structure of the code. If you are working with lots of 3-component and 3x3 arrays, you might want to use StaticArrays.jl, and you might get better performance if you transition away from a Matlab/Numpy “vectorized” style to a more loopy style.

For example, a typical “vectorized” code might look like:

generate 1000 multivariate normal vectors
normalize them
loop over them and do some computation
loop over the results of that computation and do some other computation
etc.

which leaves a lot of performance on the table because you are looping over and over again over the same array (or worse, generating new arrays with each pass). Better to do (in an allocation-free way, e.g. with SVector):

generate 1 multivariate normal vector
normalize it
do some computation with it
do some computation with the result
and then repeat this 1000 times, i.e. put the loop for i = 1:n outside these steps. If it’s done right, this will often be faster.

stevengj · May 17, 2024, 11:16pm

Yes, there should be, if you always know you have 3x3 covariance matrices, then you can use StaticArrays.

schwob · May 17, 2024, 11:27pm

Thank you for your responses. I’ll look through StaticArrays this weekend and keep you updated in this thread. I appreciate your help thus far!

stevengj · May 18, 2024, 2:40am

For example, let’s compare your method:

function doit(mu, V, n)
    X = rand(MvNormal(mu, Hermitian(V)), n)
    foreach(normalize!, eachcol(X))
    return X
end

to a version that expects static arrays, and returns an array of 3-component SVectors instead of a matrix:

using StaticArrays
function doit(mu::SVector, V::SMatrix, n)
    L = cholesky(V).U'
    X = [normalize(mu + L * @SVector(randn(3))) for _=1:n]
    return X
end

On my machine, I find that

mu_s = SA[2., 2., 2.]
V_s = SA[0.74 -0.08 0.34; -0.08 1.57 -0.5; 0.34 -0.5 1.16]
@btime doit($mu_s, $V_s, $n);

is almost twice as fast as @btime doit($mu, $V, $n); and if you pre-allocate X then you gain another 10%.

(And, as I said above, it may be even better if you merge this loop with any additional processing you want to do on these random vectors, rather than the traditional “vectorized” style of doing many individual passes over the data with built-in “vectorized” operations, which is only necessary in languages where loops are slow.)

Note that this implementation is essentially pulling out the internals of the MvNormal algorithm: if V has the Cholesky factorization V = L L^*, and r = randn(3), then x = \mu + L r has mean \mu and covariance V.

schwob · May 18, 2024, 3:41am

Thank you for providing this example. My d will not always be 3, but the suggested code still seems quicker even with larger d. I found that

function doit(mu::SVector, V::SMatrix, n)
    L = cholesky(V).U'
    X = [normalize(mu + L * @SVector(randn(d))) for _=1:n]
    return X
end

(replacing the 3 in randn with a variable) significantly slowed things down. Is there a way around this, or is that part of the JIT nature of Julia? If so, that’s ok. I’ll just need to manually change the d values if d changes between experiments.

These static arrays seem pretty fast! I’ll try to implement them elsewhere in my algorithm. Thank you for your guidance!

Also, when you mention the pre-allocation of X, I’m not sure I know what you mean. Does this mean that X is defined before the function call, and we just modify it within the function?

stevengj · May 18, 2024, 12:59pm

The key to exploiting static arrays is that the length must be known at compile time.

One way to do this but still allow it to vary at a high level (not inside tight loops!) is to pass the length as part of a type. One way is to pass it via a Val argument, but it’s even better (more elegant and reliable) to get it from another SVector argument, in this case mu. Unfortunately I’m typing on my phone now so I can’t show you an example, but I’ll post something later if no one else beats me.

PS. Also, since your d is apparently a global variable, this is particularly slow.

Dan · May 18, 2024, 1:21pm

Replacing d with length(mu) is enough. Since mu is an SVector, its length is known at compile time and so it is essentially equivalent to hardcoding the constant. Also the d causing a slowdown could probably be attributes to it being a global variable, which really annoys Julia (this is mentioned in all Julia performance tips).

schwob · May 18, 2024, 3:32pm

Thank you for the tips! Dan suggested length(mu), which works quite well

schwob · May 18, 2024, 3:33pm

Thank you for your help, Dan! I greatly appreciate your time.

Topic		Replies	Views
How to sample from MvNormal without allocating? Statistics question	5	594	August 15, 2023
Fastest way to sample from MVN, changing parameters Julia at Scale statistics , distributions , gaussian-process , probablistic	7	111	May 3, 2025
Creating random Static Vector by StaticVector.jl and Distributions.jl in a loop New to Julia loops , staticarrays , random	4	699	March 27, 2022
Is this normal things about MvNormal? General Usage question , distributions	5	87	September 3, 2024
Need help implementing non-allocating normal distribution with StaticArrays Performance memory-allocation	2	281	January 13, 2023

Is there a quicker way to obtain normalized multivariate random normal draws?

Related topics