Weighted covariance: how to compute it?


#1

If d is a data set and k,l = size(d), then
cov = d'*d/k - mean(d,1)'*mean(d,1)

It is a nice formula.

But how do I compute the covariance if each row is weighted by a vector w?
Paul


#2

https://juliastats.github.io/StatsBase.jl/latest/weights.html
https://juliastats.github.io/StatsBase.jl/latest/cov.html


#3

Thanks, but I need an algebraic formula (my matrix is too large for this approach). Can somebody help?
Paul


#4

The methods above do not have a lot of overhead, but if you need online methods,


can also do covariance matrices very efficiently.


#5

Thanks, but I have a really huge matrix; only an algebraic approach will work for it!
Can somebody help?


#6

If you can share how the methods mentioned by @Tamas_Papp failed and what else you have tried you are more likely to get a concrete answer. How big is your really huge matrix?


#7

using OnlineStats  # assumed: CovMatrix and Series come from OnlineStats.jl
y = rand(1000, 15000)
k, l = size(y)
o = CovMatrix(l) # takes a few minutes …
Series(y, o) # nothing happens

Julia 0.6.0
Windows 7, 8 cores, 8 GB RAM
y is only a sample;
my real matrix is sparse, of size 10^7 x 30*10^4

I am looking for an algebraic formula because then I can use pmap.
Paul


#8

Are those numbers right? Even if your data is sparse, the covariance matrix will generally be dense and, at 8 bytes per entry, would require about 670 GiB to store.

julia> 300_000^2 * 8 / 1024 ^3 
670.5522537231445

#9

I know, and I'm looking for this algebraic formula.
Paul

On 2017-11-24 at 22:23, Josh Day wrote:


#10

Why is this a Julia question then? In any case, see Wikipedia.
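
For reference, the weighted analogue of the formula in #1 can be written out as below. This is a sketch under stated assumptions: one nonnegative weight w_i per row, x_i denotes row i of d as a 1 x l row vector, and the normalization is the same uncorrected one as in #1 (divide by the total weight, no bias correction). The symbols W and \mu_w are introduced here for illustration.

```latex
\begin{align}
W &= \sum_{i=1}^{k} w_i, \qquad
\mu_w = \frac{1}{W} \sum_{i=1}^{k} w_i \, x_i \\
\operatorname{cov}_w
  &= \frac{1}{W} \sum_{i=1}^{k} w_i \, (x_i - \mu_w)^\top (x_i - \mu_w) \\
  &= \frac{1}{W} \, d^\top \operatorname{diag}(w) \, d \;-\; \mu_w^\top \mu_w
\end{align}
```

With w_i = 1 for every row, W = k and \mu_w = mean(d,1), so this reduces to d'*d/k - mean(d,1)'*mean(d,1).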

The libraries I linked above implement two-pass and online algorithms for this statistic. I imagine you would be much better off using them than rolling your own.
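
If you do want to roll your own, here is a minimal sketch of the algebraic route, not a tuned implementation. Assumptions: Julia 1.x syntax (not the 0.6 used earlier in the thread), nonnegative weights with one weight per row, and the same uncorrected normalization as the formula in #1. The names `weighted_cov`, `block_sums` and `weighted_cov_blocks` are hypothetical helpers, not part of StatsBase or any other package.

```julia
using LinearAlgebra, Statistics

# Weighted covariance of the columns of d, where row i has weight w[i]:
#   cov_w = d' * diag(w) * d / W - mu' * mu,  W = sum(w),  mu = weighted column means
function weighted_cov(d::AbstractMatrix, w::AbstractVector)
    size(d, 1) == length(w) || throw(DimensionMismatch("need one weight per row"))
    W  = sum(w)
    mu = (d' * w) ./ W                     # weighted column means, length l
    return (d' * (w .* d)) ./ W .- mu * mu'
end

# The three sums are additive over rows, so each block of rows can be
# processed independently and the partial results added up afterwards.
block_sums(d, w) = (d' * (w .* d), d' * w, sum(w))

function weighted_cov_blocks(d::AbstractMatrix, w::AbstractVector; blocksize::Int = 1000)
    k = size(d, 1)
    ranges = [i:min(i + blocksize - 1, k) for i in 1:blocksize:k]
    # Assumption: `map` could be replaced by `pmap` from the Distributed
    # standard library; that variant is not tested here.
    parts = map(r -> block_sums(d[r, :], w[r]), ranges)
    S  = sum(p[1] for p in parts)          # sum of d_b' * diag(w_b) * d_b
    s  = sum(p[2] for p in parts)          # sum of d_b' * w_b
    W  = sum(p[3] for p in parts)          # total weight
    mu = s ./ W
    return S ./ W .- mu * mu'
end

# Usage: unit weights reproduce the uncorrected sample covariance.
d = [1.0 2.0; 3.0 5.0; 4.0 9.0]
C = weighted_cov(d, ones(3))
```

Note that the result is still a dense l x l matrix, so the memory estimate from #8 applies to this approach as well; splitting by rows only spreads out the computation, not the storage of the result.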