Statistics on random matrices

jamblejoe · June 5, 2020, 12:03am

Hi,
I am generating a lot of random matrices and want to study their statistical properties. Right now I am storing them in a 3-dimensional array, where the first two dimensions index the matrix entries (say, square matrices of size D) and the third dimension indexes the realizations of the random matrices. E.g.

D = 10
realizations = 20
matrices = rand(D, D, realizations)  # D×D×realizations array of random entries

holds 20 10x10 matrices. Now I want to calculate some statistics for each entry of the random matrices, say the mean of each entry or the variance of each entry. I did this in two ways. First, by explicitly writing out the formulas for mean and variance and then using them for each entry, e.g.

mu = zeros(D,D)
for i in 1:realizations
  mu .+= matrices[:,:,i]
end
mu ./= realizations

Second, by converting the D x D x realizations array into a D x D matrix of vectors of length realizations, and then broadcasting the functions mean and var from Statistics over it.

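using Statistics
# D×D matrix whose (i, j) entry is the vector of all realizations of that entry
matrices_2 = [matrices[i, j, :] for i in 1:D, j in 1:D]
mean.(matrices_2)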

The array conversion hit performance really hard, because I did it naively in a loop.

How would I achieve my goal of calculating some statistics for each entry of the random matrices by using functions from Statistics or any other package, without having to allocate a new array?

EDIT: The answer for mean and var is the marked answer, while for higher moment estimations (and general estimations) see the post by @nilshg (Statistics on random matrices - #7 by nilshg).

johnczito · June 5, 2020, 12:16am

Functions like mean, var, std, and median take an optional dims keyword argument that applies the reduction along the given dimension(s) of a multidimensional array, giving one result per remaining index.

using Statistics

X = rand(2, 2, 10)

mean(X, dims = 3)
var(X, dims = 3)
std(X, dims = 3)
median(X, dims = 3)
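
For the "no new array" part of the question, one possible sketch for the mean is the in-place mean! from Statistics, which writes into a preallocated output. The names D, realizations, matrices, and mu are taken from the question; the randn data is only a stand-in assumed for illustration.

```julia
using Statistics

D, realizations = 10, 20
matrices = randn(D, D, realizations)   # stand-in data, assumed for illustration

mu = zeros(D, D)                       # allocated once, can be reused across calls
mean!(mu, matrices)                    # reduces over the dimension that mu lacks

# Agrees with the allocating version
mu ≈ dropdims(mean(matrices, dims = 3), dims = 3)
```

Because mu is already a D x D matrix, there is no trailing singleton dimension to strip afterwards.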

jamblejoe · June 5, 2020, 12:22am

@johnczito Thanks for the quick answer! I somehow missed that functionality when going through the documentation.

The result is a 2×2×1 Array. Is there a more elegant way to reduce the dimensions by 1 and get a 2x2 Array, than calling

mean(X, dims = 3)[:,:,1]

?

johnczito · June 5, 2020, 12:27am

The dropdims function will do the trick.

julia> using Statistics

julia> X = rand(2, 2, 10);

julia> M = mean(X, dims = 3)
2×2×1 Array{Float64,3}:
[:, :, 1] =
 0.500996  0.491739
 0.545284  0.582671

julia> dropdims(M, dims = 3)
2×2 Array{Float64,2}:
 0.500996  0.491739
 0.545284  0.582671

jamblejoe · June 5, 2020, 12:28am

@johnczito Perfect, thanks!

jamblejoe · June 8, 2020, 1:47pm

@johnczito Sorry to disturb you again. I looked into StatsBase, and the functions for calculating skewness and kurtosis seem to lack the above-mentioned dims functionality! Is there a workaround?

nilshg · June 8, 2020, 3:46pm

You can generally apply a function to different slices of a higher dimensional array using mapslices, see e.g. my answer here: Julia: Get range (minimum / maximum values) of a multidimensional array along specific axes - Stack Overflow
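
A minimal sketch of that idea for the array layout in this thread. The helper sample_skewness is hypothetical and written out by hand so the snippet only needs Statistics; a function such as StatsBase.skewness could be passed to mapslices in its place.

```julia
using Statistics

# Hypothetical helper: biased sample skewness of a vector.
function sample_skewness(v)
    m = mean(v)
    s = sqrt(mean(abs2, v .- m))       # biased standard deviation
    return mean((v .- m) .^ 3) / s^3
end

X = rand(2, 2, 10)

# Each slice X[i, j, :] is passed to the function as a vector
S = mapslices(sample_skewness, X; dims = 3)   # 2×2×1 array
S = dropdims(S, dims = 3)                     # 2×2 matrix
```

mapslices works with any function that accepts a vector, but it is usually slower than the built-in dims reductions, so it is probably best kept for statistics that have no dims keyword of their own.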
