StatsPlots corrplot is not executing

Interesting problem here. Perhaps this would be better filed as an issue in the StatsPlots GitHub repo, so I’m asking here first to find out.

I am working in a Jupyter notebook (v4.4) with Julia 1.1.1 on a MacBook Pro (Retina, 13-inch, Early 2015), 2.7 GHz Intel Core i5, 8 GB 1867 MHz DDR3.

The corrplot function of StatsPlots isn’t working. I don’t get an error message; the cell just keeps executing indefinitely. I thought my variable matrix might be too big, so I reduced it to 2 variables, and it still keeps running.

train = select(t_new, continuous[1:2])

using JuliaDB: ML

sch = ML.schema(train, hints=Dict(
        :is_bad => ML.Categorical,
        )
)

train

Table with 9726 rows, 2 columns:
is_bad emp_length
──────────────────
0 10
0 1
0 4

The table in question has fewer than 10,000 rows and is a mix of Int64 and Float64 columns. The schema is parsed as one categorical variable, with the rest continuous. It is a pretty typical dataset. I’ve used seaborn’s correlation plot function in Python, and it takes about 10 seconds to execute.

Try opening an issue on StatsPlots, ideally with a link to the offending table so that the problem can be reproduced. It sounds like it could be an efficiency issue. Maybe tag piever in the issue, as he’s a dev on both StatsPlots and JuliaDB.
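If the real data can’t be shared, a rough sketch of a self-contained reproduction might look like the following. It assumes that a plain Float64 matrix of the same shape (9726 rows, 2 columns) is a fair stand-in for the JuliaDB table; the values are random and made up, and only the column names come from the post. Whether this triggers the same hang is not known, so treat it as a starting point rather than a confirmed reproducer.

```julia
using StatsPlots  # provides corrplot

# Synthetic stand-in with the same shape as the table in the post.
# The values are random and hypothetical; only the names are from the post.
n = 9726
is_bad = Float64.(rand(0:1, n))       # binary column
emp_length = Float64.(rand(1:10, n))  # small integer range
M = hcat(is_bad, emp_length)          # n×2 Float64 matrix

# Time the call twice: the first run includes compilation time.
@time corrplot(M, label = ["is_bad", "emp_length"])
@time corrplot(M, label = ["is_bad", "emp_length"])
```

If this finishes quickly, the slowdown is more likely tied to how the table is passed to corrplot than to the number of rows, which would be worth mentioning in the issue.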

OK. I posted it as an issue here: https://github.com/JuliaPlots/StatsPlots.jl/issues/241