It seems that StatsBase.cov2cor() is not working correctly. First, it takes two arguments: the covariance matrix and “a vector of standard deviations”, which should NOT be required. (Note that cor2cov() does need both of these arguments.)
Second, after some experiments, it seems that cov2cor() cannot recover the correlation matrix:
using StatsBase
s = [2.0, 3.0]
corr = [1.0 0.2; -0.5 1.0]
covv = transpose(s .* transpose(corr) ) .* s
julia> covv
2×2 Array{Float64,2}:
4.0 1.2
-3.0 9.0
julia> corr
2×2 Array{Float64,2}:
1.0 0.2
-0.5 1.0
julia> StatsBase.cov2cor(corr, s)
2×2 Array{Float64,2}:
1.0 -0.0833333
-0.0833333 1.0
julia> StatsBase.cov2cor(corr, s * 0.5)
2×2 Array{Float64,2}:
1.0 -0.333333
-0.333333 1.0
julia> StatsBase.cov2cor(corr, [1.0, 1.0])
2×2 Array{Float64,2}:
1.0 -0.5
-0.5 1.0
^^^ All of the correlation matrices returned by StatsBase.cov2cor() are incorrect. Note that the results are always symmetric, which is wrong.
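To make the symmetry easier to see, here is a minimal sketch that reproduces the three outputs above. It assumes the function rescales only the lower triangle by s[i] * s[j] and mirrors it into the upper triangle. That behaviour is inferred from the printed results, not confirmed from the package source, and `mirrored_cov2cor` is a hypothetical name. The inputs are the same ones passed in the calls above.

```julia
# Hypothetical re-implementation, inferred only from the outputs above;
# this is not the StatsBase/Statistics source.
function mirrored_cov2cor(C::AbstractMatrix, s::AbstractVector)
    n = length(s)
    size(C) == (n, n) || throw(DimensionMismatch("C must be n×n for n = length(s)"))
    R = float.(C)  # fresh floating-point copy, so C is left untouched
    for j in 1:n
        R[j, j] = 1.0
        for i in (j + 1):n
            R[i, j] = C[i, j] / (s[i] * s[j])  # rescale the lower triangle
            R[j, i] = R[i, j]                  # mirror it into the upper triangle
        end
    end
    return R
end

s = [2.0, 3.0]
corr = [1.0 0.2; -0.5 1.0]

mirrored_cov2cor(corr, s)           # off-diagonal entries: -0.5 / (3 * 2)
mirrored_cov2cor(corr, s * 0.5)     # off-diagonal entries: -0.5 / (1.5 * 1)
mirrored_cov2cor(corr, [1.0, 1.0])  # off-diagonal entries: -0.5
```

Under that assumption the upper-triangle entry 0.2 is never read, which would explain why every result is symmetric.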
R does give the correct correlation matrix, and its cov2cor() takes only one argument:
> cov2cor(matrix(c(4, -3, 1.2, 9), nr = 2))
[,1] [,2]
[1,] 1.0 0.2
[2,] -0.5 1.0
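The one-argument behaviour that R shows can be sketched in Julia by taking the standard deviations from the diagonal of the covariance matrix. `cov2cor_onearg` is a hypothetical name, not a StatsBase function, and the sketch does no symmetrizing or clamping; it only divides each entry by the product of the two standard deviations.

```julia
using LinearAlgebra  # provides diag

# Hypothetical helper, not part of StatsBase: the standard deviations are
# derived from the diagonal of C, so no second argument is needed.
function cov2cor_onearg(C::AbstractMatrix)
    s = sqrt.(diag(C))     # standard deviations from the variances
    return C ./ (s .* s')  # entry (i, j) becomes C[i, j] / (s[i] * s[j])
end

covv = [4.0 1.2; -3.0 9.0]
cov2cor_onearg(covv)
```

On the covv matrix from the experiment this should agree with the R output up to floating-point rounding, including the asymmetric off-diagonal entries.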
Moreover, looking at the source code, while cov2cor() is implemented in cov.jl in StatsBase, it actually calls cov2cor!(), which is implemented in Statistics.jl in Statistics. It’s kind of strange, isn’t it?
Finally, I don’t think the implementation of cov2cor!() is correct …