Interesting problem here. Perhaps this would be better filed as an issue in the StatsPlots git repo, so I’m asking here first to find out.
I am running my work in a Jupyter notebook (v4.4) with Julia 1.1.1 on a MacBook Pro (Retina, 13-inch, Early 2015), 2.7 GHz Intel Core i5, 8 GB 1867 MHz DDR3.
The corrplot function of StatsPlots isn’t working. I’m not getting an error message; the cell just keeps executing indefinitely. I thought my variable matrix might be too big, so I reduced it to 2 variables, and it still keeps running.
train = select(t_new, continuous[1:2])
using JuliaDB: ML
sch = ML.schema(train, hints=Dict(
    :is_bad => ML.Categorical,
))
train
Table with 9726 rows, 2 columns:
is_bad emp_length
──────────────────
0 10
0 1
0 4
⋮
The table in question has fewer than 10,000 examples and is a mix of Int64 and Float64 columns. The schema is parsed as all continuous variables plus 1 categorical variable (is_bad). Pretty typical dataset. I’ve used the seaborn correlation plot function in Python, and it takes about 10 seconds to execute.
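One way to check whether the data itself is the bottleneck is to time the plain correlation computation, separately from any plotting. The sketch below is only an illustration: it uses synthetic stand-in data with the same shape as the table above (9726 rows, 2 columns), since the real table is not reproduced here, and the variable names M, C and elapsed are made up for the example. It relies only on the Statistics and Random standard libraries.

```julia
using Statistics, Random

# Synthetic stand-in for the real table (assumption: same shape and
# value ranges as the printed sample, not the actual data).
Random.seed!(1)
n = 9726
is_bad = Float64.(rand(0:1, n))       # binary 0/1 column
emp_length = Float64.(rand(0:10, n))  # integer-valued column

# With the real JuliaDB table, the columns could presumably be pulled out
# with something like select(train, :is_bad); that is an assumption and is
# not exercised here.
M = hcat(is_bad, emp_length)          # n x 2 matrix, one column per variable

C = cor(M)                            # 2 x 2 Pearson correlation matrix
elapsed = @elapsed cor(M)             # time a second, already-compiled call

println("correlation matrix size: ", size(C))
println("cor(M) took ", elapsed, " seconds")
```

If this step finishes quickly on the real columns as well, the time is being spent in the plotting call (or its first-call compilation) rather than in computing the correlations.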