How can I plot a correlation heatmap?

Hello Guys!

I need some help. I want to make this heatmap:

heat

I’m trying with this code:

Plots.heatmap(cor(Matrix(df[!, [:Age, :Balance, :Tenure, :NumOfProducts, :EstimatedSalary]])))

image

But how can I add the correlation values as annotations and put the column names on the ticks?

Another try but using a DataFrame:

a = DataFrame(cor(Matrix(df[!,[:Age,:Balance, :Tenure,:NumOfProducts, :EstimatedSalary]])), [:Age,:Balance, :Tenure,:NumOfProducts, :EstimatedSalary]) 
 Row  Age          Balance     Tenure       NumOfProducts  EstimatedSalary
      Float64      Float64     Float64      Float64        Float64
   1   1.0          0.0283084  -0.00999683  -0.0306801     -0.00720104
   2   0.0283084    1.0        -0.0122539   -0.30418        0.0127975
   3  -0.00999683  -0.0122539   1.0          0.0134438      0.00778383
   4  -0.0306801   -0.30418     0.0134438    1.0            0.0142042
   5  -0.00720104   0.0127975   0.00778383   0.0142042      1.0
begin
@df a Plots.heatmap(cols([:Age,:Balance, :Tenure,:NumOfProducts,:EstimatedSalary]))
	annotate!([(1, 1, (a[1,1], 8, :red, :center))])
end

image

Here I used annotate! to add a correlation value, but I did it manually for a single cell, and I have no idea how to put the column names on the ticks.

Is there a cleverer or easier way to do that?

Thx!

One way is by searching discourse, which led me here.

I worked on this all day and found a solution. Of course, it isn't the best one.

1 - First I created a DataFrame with the correlations:

a = DataFrame(cor(Matrix(df[!,[:Age,:Balance, :Tenure,:NumOfProducts, :EstimatedSalary]])), [:Age,:Balance, :Tenure,:NumOfProducts, :EstimatedSalary]) 
 Row  Age          Balance     Tenure       NumOfProducts  EstimatedSalary
      Float64      Float64     Float64      Float64        Float64
   1   1.0          0.0283084  -0.00999683  -0.0306801     -0.00720104
   2   0.0283084    1.0        -0.0122539   -0.30418        0.0127975
   3  -0.00999683  -0.0122539   1.0          0.0134438      0.00778383
   4  -0.0306801   -0.30418     0.0134438    1.0            0.0142042
   5  -0.00720104   0.0127975   0.00778383   0.0142042      1.0

2 - Then this for loop, which builds one annotation tuple per cell:

begin
	g = []
	b = 1
	for e in 1:5
		for i in eachrow(a[!,[:Age,:Balance,:Tenure,:NumOfProducts,:EstimatedSalary]])
			if b == 1
				c = (e, e, (string(round(i[1],digits = 3)), 8, :red, :center))
			elseif b == e
				c = (e, 1, (string(round(i[1],digits = 3)), 8, :red, :center))
			else
				c = (e, b, (string(round(i[e],digits = 3)), 8, :red, :center))
			end
			
			if b <= 4 
				b += 1
			else
				b = 1
			end
			push!(g,c)
		end
	end
end

3 - Convert the types of g:

k = Vector{Tuple{Int64, Int64, Tuple{String, Int64, Symbol, Symbol}}}(g)

4 - Finally the Correlation Heatmap.

begin
	xlabel = names(a)
	
	@df a Plots.heatmap(cols([:Age,:Balance, :Tenure,:NumOfProducts,:EstimatedSalary]);
	xticks=(1:5, xlabel),
	yticks=(1:5, xlabel))
	annotate!(k)

end

image

Thanks, with your post I got the column names onto the ticks.
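
Step 3 above converts g because g = [] creates a Vector{Any}. One possible way to avoid the conversion, sketched here with a made-up 2x2 matrix C and a hypothetical alias Label (neither is from the posts above), is to give the vector its element type when it is created:

```julia
# Element type for one annotation: (x, y, (label, font size, color, alignment)).
# `Label` is a hypothetical alias introduced only for this sketch.
Label = Tuple{Int, Int, Tuple{String, Int, Symbol, Symbol}}

# Made-up 2x2 correlation matrix, only for illustration.
C = [1.0 0.3; 0.3 1.0]

g = Label[]   # typed from the start, unlike `g = []`
for i in axes(C, 1), j in axes(C, 2)
    # Row index i becomes y, column index j becomes x.
    push!(g, (j, i, (string(round(C[i, j]; digits = 3)), 8, :red, :center)))
end
```

With this, g already has the element type that step 3 produces, so it could presumably be passed to annotate! without the explicit conversion.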

FYI, the plot argument xrotation=90 rotates the x tick labels, as in your model example.

Nice! Thanks for the tip.

Below is the final function to plot a correlation heatmap:


using DataFrames, Statistics, Plots, StatsPlots  # @df is provided by StatsPlots

function corrheatmap(df::AbstractDataFrame, colnames::Vector{Symbol})
	a = DataFrame(cor(Matrix(df[!,colnames])), colnames)
	
	g = []
	b = 1
	for e in 1:length(colnames)
		for i in eachrow(a[!,colnames])
			if b == 1 && e != 1
				c = (e, e, (string(round(i[1],digits = 3)), 8, :red, :center))
			elseif b == e
				c = (e, 1, (string(round(i[1],digits = 3)), 8, :red, :center))
			else
				c = (e, b, (string(round(i[e],digits = 3)), 8, :red, :center))
			end
			
			if b < length(colnames) 
				b += 1
			else
				b = 1
			end
			push!(g,c)
		end
	end

	k = Vector{Tuple{Int64, Int64, Tuple{String, Int64, Symbol, Symbol}}}(g)

	@df a Plots.heatmap(cols(colnames);
		xticks=(1:length(colnames), colnames),
		yticks=(1:length(colnames), colnames),
		xrotation = 90)
	annotate!(k)
end

corrheatmap(df,[:Age,:Balance,:EstimatedSalary])

image

:+1:

Your efforts are commendable, but please note that this task should be less labor-intensive. For example, we could do:

using DataFrames, Statistics, Plots

# INPUT DATA
df = DataFrame(Age=rand(4), Balance=rand(4), Tenure=rand(4), Salary=rand(4))
cols = [:Age, :Balance, :Tenure]  # define subset
M = cor(Matrix(df[!,cols]))       # correlation matrix

# PLOT
(n,m) = size(M)
heatmap(M, fc=cgrad([:white,:dodgerblue4]), xticks=(1:m,cols), xrot=90, yticks=(1:m,cols), yflip=true)
annotate!([(j, i, text(round(M[i,j],digits=3), 8,"Computer Modern",:black)) for i in 1:n for j in 1:m])

(NB: added yflip=true for standard orientation of matrix display)
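
The index order in the annotation comprehension is easy to get backwards: heatmap(M) draws M[i,j] at x = j, y = i, so each label goes at (j, i). Here is a minimal sketch in plain Julia. The helper name annotation_triples and the matrix are made up for illustration, and the matrix is deliberately non-symmetric so that the orientation is visible:

```julia
# Hypothetical helper (not from the thread): build (x, y, label) triples for a matrix.
# The column index j is the x position and the row index i is the y position.
function annotation_triples(M::AbstractMatrix; digits::Integer = 3)
    n, m = size(M)
    return [(j, i, string(round(M[i, j]; digits = digits))) for i in 1:n for j in 1:m]
end

# Made-up, non-symmetric matrix so that swapping i and j would be noticeable.
M = [1.0 0.25; -0.5 1.0]
triples = annotation_triples(M)
```

The triples could then be wrapped with text and passed on, for example annotate!([(x, y, text(s, 8, :black)) for (x, y, s) in triples]). For a real correlation matrix the swap is harmless because the matrix is symmetric, but it matters for any other matrix.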

Wow! Nice @rafael.guerra !

Your solution is much better and cleaner. I’ll use that.

Thx for participating!

If somebody arrives here looking for a solution with Makie, below is an adaptation of @rafael.guerra’s code:

df = DataFrame(Age=rand(4), Balance=rand(4), Tenure=rand(4), Salary=rand(4))
cols = [:Age, :Balance, :Tenure]  # define subset
M = cor(Matrix(df[!,cols]))       # correlation matrix
(n,m) = size(M)
# # Plots.jl
# Plots.heatmap(M, fc=cgrad([:white,:dodgerblue4]), xticks=(1:m,cols), xrot=90, yticks=(1:m,cols), yflip=true)
# Plots.annotate!([(j, i, Plots.text(round(M[i,j],digits=3), 8,"Computer Modern",:black)) for i in 1:n for j in 1:m])
# Makie.jl
fig, axis_hm, plot_hm =
    Makie.heatmap(M;
        colormap = :RdBu, colorrange = (-1,1),
        axis = (xticks=(1:m, String.(cols)),
                yticks=(1:m,String.(cols)),
                yreversed=true,
                xticklabelrotation = π/2));
Makie.Colorbar(fig[1, 2], plot_hm);
[Makie.text!(axis_hm,
    "$(round(M[i,j],digits=3))",
    position = (i,j),
    align = (:center, :center), fontsize=14,
    color = ifelse(abs(M[i,j]) > 0.5, :white, :black))
    for i in 1:n for j in 1:m];
fig
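
The ifelse in the Makie snippet picks the label color per cell so the text stays readable on the diverging :RdBu colormap: white on strongly colored cells, black on pale ones. The same rule as a small standalone function, where the name label_color and the threshold keyword are assumptions made for this sketch:

```julia
# Hypothetical helper mirroring the ifelse above: white text on cells with a
# strong correlation (dark colors in :RdBu), black text otherwise.
label_color(r::Real; threshold::Real = 0.5) = abs(r) > threshold ? :white : :black

# Values taken from the correlation table earlier in the thread.
colors = [label_color(r) for r in (1.0, -0.30418, 0.0283084)]
```

The 0.5 cutoff comes from the snippet above. A different colormap may need a different threshold.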

Bumping to add a PlotlyJS solution in case anyone is interested.

using Statistics
using DataFrames
using Printf
using PlotlyJS

function plot_cormap(df::AbstractDataFrame; colorscale="Viridis", fontcolor="#aabbcc", fontsize=14)
    m = cor(Matrix(df))
    annotations = [
        attr(text=@sprintf("%.2f", m[i,j]), x=names(df)[i], y=names(df)[j], showarrow=false, font=attr(size=fontsize, color=fontcolor)) for i in 1:size(m)[1] for j in 1:size(m)[2]
    ]
    t = heatmap(z=m, x=names(df), y=names(df), colorscale=colorscale)

    layout = Layout(;
        annotations
    )

    return plot(t, layout)
end

The following figure is produced by:

df = DataFrame(rand(10, 10), :auto)
plot_cormap(df)
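
Note that the PlotlyJS version formats labels with @sprintf("%.2f", ...), while the Plots and Makie versions use round(..., digits=3). The two give slightly different labels. Here is a small comparison using only the Printf standard library and values from the correlation table earlier in the thread (the variable names are made up for this sketch):

```julia
using Printf

# Correlation values taken from the table earlier in the thread.
vals = [1.0, 0.0283084, -0.30418]

# Fixed two decimals, as in plot_cormap: always the same label width.
fixed = [@sprintf("%.2f", v) for v in vals]

# Rounded to three digits, as in the Plots and Makie versions: trailing zeros are dropped.
rounded = [string(round(v; digits = 3)) for v in vals]
```

Here fixed is ["1.00", "0.03", "-0.30"] and rounded is ["1.0", "0.028", "-0.304"]. Fixed-width labels tend to look more uniform in a grid, at the cost of one digit of precision in this case.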