Weighted covariance, how to compute?

If d is a data set and k,l = size(d), then
cov = d'*d/k - mean(d,1)'*mean(d,1)

It is a nice formula.

But how do I compute the covariance if each row is weighted by a vector w?
Paul

https://juliastats.github.io/StatsBase.jl/latest/weights.html
https://juliastats.github.io/StatsBase.jl/latest/cov.html


Thanks, but I need an algebraic formula (my matrix is too large for this approach). Can somebody help?
Paul

The methods above do not have a lot of overhead, but if you need online methods,

can also do covariance matrices very efficiently.


Thanks, but I have a really huge matrix, so only an algebraic approach will work for it!
Can somebody help?

If you can share how the methods mentioned by @Tamas_Papp failed and what else you have tried you are more likely to get a concrete answer. How big is your really huge matrix?


y=rand(1000,15000)
k,l=size(y)
o = CovMatrix(l) # takes a few minutes …
Series(y,o) # nothing happens

Julia 0.6
Windows 7, 8 cores, 8 GB RAM
y is only a sample;
my real matrix is sparse, of size 10^7 x 30*10^4

I am looking for an algebraic formula because then I can use pmap.
Paul

Are those numbers right? Even if your data is sparse, the covariance matrix will be dense, and require 670GB to store.

julia> 300_000^2 * 8 / 1024 ^3 
670.5522537231445

I know, and I'm looking for this algebraic formula.
Paul

On 2017-11-24 at 22:23, Josh Day wrote:

Why is this a Julia question then? In any case, see Wikipedia.
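For reference, the weighted analogue of the formula in the first post can be written as follows. This is a sketch under two assumptions not stated in the thread: the weights are nonnegative and normalized so that they sum to 1, and the uncorrected (1/k-style) estimator is wanted. Here x_i^T denotes row i of d, and the symbols mu and C are introduced only for illustration.

```latex
\mu = \sum_{i=1}^{k} w_i \, x_i = d^{\top} w,
\qquad
C = \sum_{i=1}^{k} w_i \,(x_i - \mu)(x_i - \mu)^{\top}
  = d^{\top} \operatorname{diag}(w)\, d \;-\; \mu \mu^{\top},
\qquad
\sum_{i=1}^{k} w_i = 1 .
```

With w_i = 1/k for every row this reduces to d'*d/k - mean(d,1)'*mean(d,1). Bias-corrected variants differ from C only by a scalar factor that depends on how the weights are interpreted; for reliability weights, for example, the factor is 1/(1 - sum_i w_i^2).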

The libraries I linked above implement two-pass and online algorithms for this statistic. I imagine you would be much better off using them than rolling your own.
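If you do want to roll your own anyway, here is a minimal sketch of the algebraic form: normalize the weights, take the weighted column means, and subtract their outer product from the weighted cross-product matrix. It is written for current Julia (1.x), not 0.6, and the function names `weighted_cov` and `weighted_cov_block` are made up for illustration; they are not part of StatsBase or any other package. It computes the uncorrected estimator. The block version returns only the rows I and columns J of the result, which is the part you could hand out with pmap, since the full 300_000 x 300_000 matrix will not fit in memory.

```julia
using LinearAlgebra, SparseArrays, Statistics

# Uncorrected weighted covariance; observations are the rows of X.
# w holds one nonnegative weight per row and is normalized here to sum to 1.
function weighted_cov(X::AbstractMatrix, w::AbstractVector)
    p = w ./ sum(w)                      # normalized weights
    mu = X' * p                          # weighted column means
    return X' * (Diagonal(p) * X) - mu * mu'
end

# Only the block C[I, J] of the covariance matrix (hypothetical helper).
# p must already be normalized, and mu must be X' * p.
function weighted_cov_block(X::AbstractMatrix, p::AbstractVector,
                            mu::AbstractVector, I, J)
    return Matrix(X[:, I]' * (Diagonal(p) * X[:, J])) - mu[I] * mu[J]'
end

# Usage on a small sparse sample
X = sprand(1000, 50, 0.01)
w = rand(1000)
p = w ./ sum(w)
mu = X' * p
C_full  = weighted_cov(X, w)
C_block = weighted_cov_block(X, p, mu, 1:10, 11:20)
```

With equal weights the result should agree with `cov(X; corrected=false)`, which is an easy sanity check before trying it on the real data.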