If d is a data set and k,l = size(d), then

cov = d'*d/k - mean(d,1)'*mean(d,1)

It is a nice formula.

But how do I compute the covariance if each row is weighted by a vector w?

Paul

https://juliastats.github.io/StatsBase.jl/latest/weights.html

https://juliastats.github.io/StatsBase.jl/latest/cov.html

Thanks, but I need an algebraic formula (my matrix is too large for this approach). Can somebody help?

Paul

The methods above do not have a lot of overhead, but if you need online methods, OnlineStats.jl can also do covariance matrices very efficiently.

Thanks, but I have a really huge matrix, so only an algebraic approach will work for it!

Can somebody help?

If you can share how the methods mentioned by @Tamas_Papp failed and what else you have tried you are more likely to get a concrete answer. How big is your *really huge matrix*?

y=rand(1000,15000)

k,l=size(y)

o = CovMatrix(l) # takes a few minutes …

Series(y,o) # nothing happens

Julia 0.6

Windows 7, 8 cores, 8 GB RAM

y is only a sample.

My real matrix is sparse, of size 10^7 x 30*10^4.

I am looking for an algebraic formula because then I can use pmap.

Paul

Are those numbers right? Even if your data is sparse, the covariance matrix will be dense, and require 670GB to store.

```
julia> 300_000^2 * 8 / 1024 ^3
670.5522537231445
```

I know, and I am looking for this algebraic formula.

Paul

On 2017-11-24 at 22:23, Josh Day wrote:

Why is this a Julia question then? In any case, see Wikipedia.
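For reference, the weighted analogue of the formula in the original post can be written out as follows. This is a sketch of the uncorrected (biased) estimator, assuming nonnegative weights $w_i$ with one weight per row, where $x_i$ is row $i$ of $d$ taken as a column vector and $D$ is the full data matrix; the symbols $\mu_w$ and $C_w$ are introduced here only for illustration:

$$
\mu_w = \frac{\sum_{i=1}^{k} w_i x_i}{\sum_{i=1}^{k} w_i},
\qquad
C_w = \frac{\sum_{i=1}^{k} w_i (x_i - \mu_w)(x_i - \mu_w)^\top}{\sum_{i=1}^{k} w_i}
    = \frac{D^\top \operatorname{diag}(w)\, D}{\sum_{i=1}^{k} w_i} - \mu_w \mu_w^\top
$$

With all $w_i = 1$ this reduces to `d'*d/k - mean(d,1)'*mean(d,1)`.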

The libraries I linked above implement two-pass and online algorithms for this statistic. I imagine you would be much better off using them than rolling your own.
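If the goal really is to roll the formula by hand, a minimal sketch might look like the following. It assumes Julia 1.x rather than the 0.6 syntax used earlier in the thread, nonnegative weights with one entry per row, and the uncorrected estimator (normalising by the sum of the weights). The function names `weighted_cov`, `block_sums` and `weighted_cov_blocks` are hypothetical and not part of StatsBase or OnlineStats. The blockwise version only illustrates that the three partial sums are additive over row blocks, which is what would make a `pmap`-style split possible.

```julia
using LinearAlgebra

# Weighted covariance of the columns of `d`, one nonnegative weight per row.
# Uncorrected estimator: with equal weights it matches d'*d/k - m'*m.
function weighted_cov(d::AbstractMatrix, w::AbstractVector)
    size(d, 1) == length(w) || throw(DimensionMismatch("need one weight per row"))
    W = sum(w)
    m = (d' * w) / W                 # weighted mean as a column vector (length l)
    S = (d' * (w .* d)) / W          # weighted second moment, l×l
    return S - m * m'
end

# Partial sums for one block of rows: total weight, weighted sum of rows,
# and weighted sum of outer products. These add up across blocks.
function block_sums(d, w, rows)
    db = d[rows, :]
    wb = w[rows]
    return (sum(wb), db' * wb, db' * (wb .* db))
end

# Same result as `weighted_cov`, accumulated block by block.
function weighted_cov_blocks(d, w; blocksize = 1000)
    k = size(d, 1)
    blocks = [i:min(i + blocksize - 1, k) for i in 1:blocksize:k]
    # `map` here; with the Distributed stdlib, `pmap` could be tried instead
    parts = map(r -> block_sums(d, w, r), blocks)
    W = sum(p[1] for p in parts)
    s = sum(p[2] for p in parts)
    S = sum(p[3] for p in parts)
    m = s / W
    return S / W - m * m'
end
```

Note that the result is still a dense l×l matrix, so splitting the rows does nothing about the memory problem pointed out above for l = 300,000.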