Can clustering be visualized with heatmap()?

With this code I get the desired heatmap.

using Plots
dataMatrix=rand(40,10)
heatmap(dataMatrix,levels=10,colorbar=false)

But I would like to get a heatmap that also shows the clustering, as in the figure:

How can I achieve this?

I am trying to reproduce the R example from this site:

Matrix data and Patterns in rows and columns
This is my progress so far. Comments, criticism, and improvements are welcome:

# R original: https://bookdown.org/rdpeng/exdata/dimension-reduction.html#matrix-data
using Plots
using StatsBase
using Clustering
using Distances
using Distributions
using RCall

# Draw the same random numbers as the R example
RCall.reval("set.seed(12345)")
dataMatrix = rcopy(RCall.reval("matrix(rnorm(400), nrow = 40)"))
heatmap(dataMatrix, levels=10, colorbar=false)

# Add a pattern: shift the last five columns by 3 in randomly chosen rows
RCall.reval("set.seed(678910)")
for i in 1:40
    coinFlip = rcopy(RCall.reval("rbinom(1, size = 1, prob = 0.5)"))
    if coinFlip == 1
        dataMatrix[i, :] .+= repeat([0, 3], inner=5)
    end
end
Hm = heatmap(dataMatrix, levels=10, colorbar=false)

# Pairwise Euclidean distances between the rows (40×40), then cluster the rows
dist_row = pairwise(Euclidean(), dataMatrix', dims=2)
hh = hclust(dist_row, linkage=:complete, branchorder=:r)
dataMatrixOrdered = dataMatrix[hh.order, :]

# Heatmap of the reordered matrix; yflip puts row 1 at the top,
# matching the 40:-1:1 row axis used for the row means below
HmOrdered = heatmap(dataMatrixOrdered, levels=10, colorbar=false, yflip=true)

## Show the row means
Rm = scatter(vec(mean(dataMatrixOrdered, dims=2)), 40:-1:1, xlab="Row Mean", ylab="Row", legend=false)

## Show the column means
Rc = scatter(1:size(dataMatrixOrdered, 2), vec(mean(dataMatrixOrdered, dims=1)), xlab="Column", ylab="Column Mean", legend=false)

plot(HmOrdered, Rm, Rc, layout=(1,3), size=(750,375))


Only the heatmap with the clusters remains to be done.

Haven’t looked at this in detail, but I don’t really understand the need for RCall here? It looks to me like everything up to and including the loop can be replaced by:

using Distributions

dataMatrix = rand(Normal(), 40, 10)

dataMatrix .+= (repeat([0, 3], inner = 5) * rand(Bernoulli(), 40)')'

You can draw the lines as usual using plot(xs, ys) where xs and ys are vectors of x and y coordinates. Use NaNs to break the lines into disjoint pieces.
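As a minimal sketch of the NaN trick, the snippet below builds the coordinates for a few "U"-shaped brackets like the ones in a dendrogram. The `bracket` helper and the merge positions and heights are made up for illustration; they are not taken from `hclust` output.

```julia
# Hypothetical helper: one "U"-shaped bracket joining two children at
# horizontal positions x1 and x2 (heights h1 and h2) at merge height h.
# The trailing NaN breaks the line so the next bracket is drawn separately.
function bracket(x1, h1, x2, h2, h)
    xs = [x1, x1, x2, x2, NaN]
    ys = [h1, h, h, h2, NaN]
    return xs, ys
end

# Made-up merges, purely for illustration: (x1, h1, x2, h2, h)
merges = [(1.0, 0.0, 2.0, 0.0, 1.0),
          (3.0, 0.0, 4.0, 0.0, 1.5),
          (1.5, 1.0, 3.5, 1.5, 3.0)]

# Concatenate all brackets into one pair of coordinate vectors
xs = Float64[]
ys = Float64[]
for m in merges
    bx, by = bracket(m...)
    append!(xs, bx)
    append!(ys, by)
end
```

With Plots loaded, `plot(xs, ys, legend=false)` should then draw the three brackets as disjoint pieces of a single series.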


I used RCall to generate identical random numbers in R and Julia.

Could you give me an example of how to get this graph?

I assumed you had a method to calculate the clustering.

Would it be possible to achieve the same results without using RCall?
That is, can Julia generate a matrix identical to the one generated through RCall?

julia> dataMatrix = rand(Normal(), 40, 10)
40×10 Array{Float64,2}:
  0.827701   0.720837   -0.394599   0.946286   -0.161712   -0.150407   0.653404  -0.292532    -1.05481    0.431463
 -1.73982    0.592936   -0.746554   0.699979    0.596363   -0.952979  -1.64594   -0.990633     0.78265   -0.708528
 -0.537933  -0.301604   -1.44274   -0.923007    1.24206     0.676487  -0.7046    -1.45653     -0.300138  -0.0181535
  0.427051  -0.87635    -0.423277  -1.41526     0.216725    0.985366  -0.52175   -0.324171    -0.354566  -0.523006
 -0.208893   0.255794   -1.20055    0.0998029   0.513713   -0.762621   0.453225  -0.00781145  -1.10023   -0.0557359
  ⋮                                                         ⋮
 -1.92591    1.48832    -0.41963    0.0665791   0.132265   -0.780594  -0.295308  -0.9693       0.322365  -0.863658
  0.51561    1.0684     -0.257043  -1.33159     0.0171234   1.34249   -1.44628    0.0534954    0.284925  -0.21738
  0.187182  -0.0597649  -0.593115   1.415      -1.91516    -0.243325  -0.315988  -0.0530183   -1.43812    0.0174793
  1.09321    0.879262    0.85225   -1.87279     0.400153   -1.7885     0.481935   0.648314     1.13771    0.0247559

julia> RCall.reval("set.seed(12345)")
RObject{NilSxp}
NULL


julia> dataMatrix=rcopy(RCall.reval("matrix(rnorm(400), nrow = 40)"))
40×10 Array{Float64,2}:
  0.585529   1.12851    0.645383    1.54486    -0.487639  -1.43615   -0.700076   -1.51386    0.380316   -0.375823
  0.709466  -2.38036    1.04314     1.32145     0.303151  -0.62926   -0.567402    0.164281   0.605137   -1.81283
 -0.109303  -1.06027   -0.304369    0.322152   -0.241974   0.243522  -0.261394   -0.870865   1.01967     0.2886
 -0.453497   0.937141   2.47711     1.53096    -0.481734   1.05836   -1.06389     1.59333    0.474943   -0.189623
  0.605887   0.854452   0.971221   -0.42124    -0.991803   0.831349  -0.106369    0.646598  -2.18595     0.0178602
  ⋮                                                        ⋮
 -0.324087   0.826258  -0.0521536   1.69935     0.85086    0.758375   0.0147936  -0.575096   0.350594    0.640739
 -1.66205   -0.81154    0.628861   -0.344299   -0.443568  -0.641736  -0.311739   -1.40636    0.0282577   0.30709
  1.76773    0.476248   2.18        0.0677721  -0.446775   0.627672  -0.956196    2.26786    0.473048   -0.0331294
  0.025801   1.02126   -0.0690173  -0.65057     0.013305   0.24833    0.473414   -0.770854  -0.919155   -1.37475

It depends on what you mean by “same results” - as you say

using Distributions
rand(Normal(), 40, 10)

in Julia is equivalent to

matrix(rnorm(400), nrow = 40)

in that it produces a 40x10 matrix of random numbers, drawn from a standard normal distribution. Of course the numbers won’t exactly be the same, so if you need the exact same numbers in Julia and R for some reason then you probably need either RCall (or JuliaCall from the R side) or some other way of transferring the numbers (e.g. writing out to and reading back in from csv).
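For the CSV route, here is a rough sketch using only Julia's standard library. The file name is arbitrary and the seed only makes the Julia side repeatable; it has nothing to do with R's `set.seed`.

```julia
using DelimitedFiles  # standard library: writedlm / readdlm
using Random

Random.seed!(12345)   # seeds Julia's RNG only, not R's
dataMatrix = randn(40, 10)

# Write the matrix to a temporary CSV file (arbitrary file name)
path = joinpath(mktempdir(), "dataMatrix.csv")
writedlm(path, dataMatrix, ',')

# Read it back to check that the values survive the round trip
roundTrip = readdlm(path, ',', Float64)
```

On the R side, something like `as.matrix(read.csv(path, header = FALSE))` should then load the same numbers, so both languages work on identical data without RCall.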
