I created a DataFrame from a CSV file with `df = DataFrame(CSV.File("input.csv"))`.

Using `describe(df)`, I noticed that I have both Int variables and string variables with eltypes such as `String7`, `String3`, and so on.

To calculate the correlation between the Int variables, I simply do

```
# names of the Int64 columns, skipping the first one
num_var_names = names(df, Int64)[2:end]
num_vars = df[!, num_var_names]
cor_matrix = cor(Matrix(num_vars))  # cor comes from Statistics
```

I also want to calculate the correlation between the categorical variables (the ones identified as `StringX`).

My first attempt was to simply do something like

`cor(Matrix(df[!, ["var1", "var2"]]))`

but I get

`MethodError: no method matching /(::InlineStrings.String7, ::Int64)`
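
For reference, here is a minimal sketch that should reproduce the problem without my CSV. The column values are made up, and they are plain `String`s rather than the `String7` values in my data, so the type named in the error message will differ.

```julia
using DataFrames, Statistics

# Made-up stand-in for my data; the real columns are String7 PooledVectors
df = DataFrame(var1 = ["a", "b", "a", "c"], var2 = ["x", "y", "y", "x"])

# Same call as above, wrapped in try/catch so the script keeps running
err = try
    cor(Matrix(df[!, ["var1", "var2"]]))
catch e
    e
end

# Show what kind of exception was thrown
println(typeof(err))
```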

If I print out `df[!, "var1"]`, I get

`PooledArrays.PooledVector{InlineStrings.String7, UInt32, Vector{UInt32}}: [values...]`

I also tried converting to categorical with

`cor(CategoricalArray(df[!, "MSZoning"]), CategoricalArray(df[!, "MSZoning"]))`

but I still get

`MethodError: no method matching /(::CategoricalArrays.CategoricalValue{InlineStrings.String7, UInt32}, ::Int64)`

So, is there a way to calculate the correlation between categorical variables? It would be good if I could do something as simple as what I did for the numerical variables.

Thanks.