What's the easiest way to create correlation matrices in Julia?

Basically what the topic title says: suppose I have an array/df that's 5 columns (5 variables) by 100 rows of data values.

I managed to get covariance matrix output with the GLM package:

lm1 = lm(@formula(Y ~ X1 + X2 + X3 + X4), DataVarArray)

and then calling for vcov:

println("Covariance matrix: ",vcov(lm1))

but I can't find a simple way to get a correlation matrix for the same df… some pointers would be appreciated.

You mean this?

julia> using DataFrames

julia> df = rand(10, 5);

julia> using Statistics

julia> cor(df)
5×5 Matrix{Float64}:
 1.0        0.480836    0.226642    0.361239    0.192271
 0.480836   1.0         0.0860495   0.841819   -0.113863
 0.226642   0.0860495   1.0        -0.0617229  -0.26865
 0.361239   0.841819   -0.0617229   1.0         0.158675
 0.192271  -0.113863   -0.26865     0.158675    1.0

Yes, that was actually the first thing I tried, but I kept getting this error:

LoadError: MethodError: no method matching cor(::DataFrame)

so I thought I was just misunderstanding that function… but your toy example works fine for me as well. Hmm, I wonder why it’s not happy with what I tried passing it… could it be that when I use hcat to put it together, it adds a Row column?

Ahh, got it! It really wants a “true” matrix…
When I cast it as:

df1 = Matrix(VarArray)

and then pass it to cor(df1), it finally works as expected!

Ah yes, sorry, I cheated a bit above, which probably added to your confusion. rand(10, 5) creates a Matrix, which is what cor requires. A DataFrame isn’t an AbstractMatrix, but as you found, it can be converted to one by calling the Matrix constructor.
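For anyone landing here later, here is a minimal sketch of the whole round trip, assuming DataFrames 1.x. The random data and the column names (borrowed from the formula in the question) are made up for illustration, and the last two lines are just one possible way to put labels back on the result.

```julia
using DataFrames, Statistics

# Hypothetical data: 100 rows, 5 variables (names are illustrative only).
df = DataFrame(rand(100, 5), [:Y, :X1, :X2, :X3, :X4])

# cor has no method for a DataFrame, so convert to a plain Matrix first.
# Each column is treated as one variable, giving a 5×5 result.
C = cor(Matrix(df))

# The same conversion works for the covariance matrix of the data.
S = cov(Matrix(df))

# A single pairwise correlation works directly on two column vectors.
r = cor(df.X1, df.X2)

# Optional: wrap the result in a DataFrame so the columns keep their names.
Cdf = DataFrame(C, names(df))
insertcols!(Cdf, 1, :variable => names(df))
```

One thing worth noting: vcov(lm1) from GLM returns the variance-covariance matrix of the estimated coefficients, not of the data columns, so cov(Matrix(df)) is the closer counterpart to the cor call here.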

All good, I learned something new today!