With this code I get the desired heatmap.
using Plots
dataMatrix=rand(40,10)
heatmap(dataMatrix,levels=10,colorbar=false)
But I would like to produce a clustered heatmap, as shown in the figure:
How can I achieve this?
I am trying to reproduce the R example from this site:
Matrix data and Patterns in rows and columns
This is my progress so far. Comments, criticism and improvements are welcome!
# https://bookdown.org/rdpeng/exdata/dimension-reduction.html#matrix-data
using Plots
using StatsBase
using Clustering
using Distances
using Distributions
using RCall
# Generate the base matrix in R so the numbers match the R example
RCall.reval("set.seed(12345)")
dataMatrix = rcopy(RCall.reval("matrix(rnorm(400), nrow = 40)"))
heatmap(dataMatrix, levels=10, colorbar=false)
# For rows selected by a coin flip, add 3 to the last five columns
RCall.reval("set.seed(678910)")
for i in 1:40
    coinFlip = rcopy(RCall.reval("rbinom(1, size = 1, prob = 0.5)"))
    if coinFlip == 1
        dataMatrix[i, :] .+= repeat([0, 3], inner=5)
    end
end
# Pairwise Euclidean distances between the rows of dataMatrix (a 40x40 matrix)
dist_row = pairwise(Euclidean(), dataMatrix', dims=2)
hh = hclust(dist_row, linkage=:complete, branchorder=:r)
dataMatrixOrdered = dataMatrix[hh.order, :]
# Heatmap of the reordered matrix; yflip puts row 1 at the top, matching 40:-1:1 below
Hm = heatmap(dataMatrixOrdered, levels=10, yflip=true, colorbar=false)
## Show the row means
Rm = scatter(vec(mean(dataMatrixOrdered, dims=2)), 40:-1:1, xlab="Row Mean", ylab="Row", legend=false)
## Show the column means
Rc = scatter(1:size(dataMatrixOrdered, 2), vec(mean(dataMatrixOrdered, dims=1)), xlab="Column", ylab="Column Mean", legend=false)
plot(Hm, Rm, Rc, layout=(1, 3), size=(750, 375))
I haven't looked at this in detail, but I don't really understand the need for RCall here. It looks to me like everything up to and including the loop can be replaced by:
using Distributions
dataMatrix = rand(Normal(), 40, 10)
dataMatrix .+= (repeat([0, 3], inner = 5) * rand(Bernoulli(), 40)')'
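As a quick sanity check, here is a minimal sketch that compares the broadcast form above with the explicit loop on the same coin flips. To stay dependency-free it assumes Base's `randn` and `rand(Bool, n)` as stand-ins for `rand(Normal(), ...)` and `rand(Bernoulli(), n)`; the names `noise`, `flips`, `shift`, `viaOuter` and `viaLoop` are made up for illustration.

```julia
# Base-only stand-ins (assumption): randn for Normal(), rand(Bool, n) for Bernoulli()
noise = randn(40, 10)
flips = rand(Bool, 40)               # one coin flip per row
shift = repeat([0, 3], inner = 5)    # [0, 0, 0, 0, 0, 3, 3, 3, 3, 3]

# Outer-product form: (10-vector * 1x40 row)' is a 40x10 matrix of per-row shifts
viaOuter = noise .+ (shift * flips')'

# Loop form, as in the original post
viaLoop = copy(noise)
for i in 1:40
    if flips[i]
        viaLoop[i, :] .+= shift
    end
end

println(viaOuter == viaLoop)
```

The outer product works because a flip of `false` multiplies the whole shift row by zero, so only the rows with a successful flip get 3 added to columns 6 to 10.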
You can draw the lines as usual using plot(xs, ys), where xs and ys are vectors of x and y coordinates. Use NaNs to break the lines into disjoint pieces.
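To illustrate the NaN trick, here is a small sketch that packs several segments into one pair of coordinate vectors. `segments_to_path` is a hypothetical helper written for this example, not part of Plots, and the segment coordinates are arbitrary.

```julia
# Hypothetical helper: concatenate line segments into one path,
# inserting NaN after each segment so that they are drawn as separate pieces.
function segments_to_path(segments)
    xs = Float64[]
    ys = Float64[]
    for ((x1, y1), (x2, y2)) in segments
        append!(xs, (x1, x2, NaN))
        append!(ys, (y1, y2, NaN))
    end
    return xs, ys
end

# Three disjoint vertical segments (arbitrary example coordinates)
segments = [((1.0, 0.0), (1.0, 2.0)),
            ((3.0, 0.0), (3.0, 1.0)),
            ((5.0, 0.0), (5.0, 3.0))]
xs, ys = segments_to_path(segments)
```

With Plots loaded, `plot(xs, ys, legend=false)` should then draw the three segments without connecting them. The same idea could be used for the branches of a dendrogram, once the branch coordinates have been computed from the clustering.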
I used RCall to generate identical random numbers in R and Julia.
I assumed you had a method to calculate the clustering.
Would it be possible to achieve the same results without using RCall?
That is, can Julia generate a matrix identical to the one generated with RCall?
julia> dataMatrix = rand(Normal(), 40, 10)
40×10 Array{Float64,2}:
0.827701 0.720837 -0.394599 0.946286 -0.161712 -0.150407 0.653404 -0.292532 -1.05481 0.431463
-1.73982 0.592936 -0.746554 0.699979 0.596363 -0.952979 -1.64594 -0.990633 0.78265 -0.708528
-0.537933 -0.301604 -1.44274 -0.923007 1.24206 0.676487 -0.7046 -1.45653 -0.300138 -0.0181535
0.427051 -0.87635 -0.423277 -1.41526 0.216725 0.985366 -0.52175 -0.324171 -0.354566 -0.523006
-0.208893 0.255794 -1.20055 0.0998029 0.513713 -0.762621 0.453225 -0.00781145 -1.10023 -0.0557359
⋮ ⋮
-1.92591 1.48832 -0.41963 0.0665791 0.132265 -0.780594 -0.295308 -0.9693 0.322365 -0.863658
0.51561 1.0684 -0.257043 -1.33159 0.0171234 1.34249 -1.44628 0.0534954 0.284925 -0.21738
0.187182 -0.0597649 -0.593115 1.415 -1.91516 -0.243325 -0.315988 -0.0530183 -1.43812 0.0174793
1.09321 0.879262 0.85225 -1.87279 0.400153 -1.7885 0.481935 0.648314 1.13771 0.0247559
julia> RCall.reval("set.seed(12345)")
RObject{NilSxp}
NULL
julia> dataMatrix=rcopy(RCall.reval("matrix(rnorm(400), nrow = 40)"))
40×10 Array{Float64,2}:
0.585529 1.12851 0.645383 1.54486 -0.487639 -1.43615 -0.700076 -1.51386 0.380316 -0.375823
0.709466 -2.38036 1.04314 1.32145 0.303151 -0.62926 -0.567402 0.164281 0.605137 -1.81283
-0.109303 -1.06027 -0.304369 0.322152 -0.241974 0.243522 -0.261394 -0.870865 1.01967 0.2886
-0.453497 0.937141 2.47711 1.53096 -0.481734 1.05836 -1.06389 1.59333 0.474943 -0.189623
0.605887 0.854452 0.971221 -0.42124 -0.991803 0.831349 -0.106369 0.646598 -2.18595 0.0178602
⋮ ⋮
-0.324087 0.826258 -0.0521536 1.69935 0.85086 0.758375 0.0147936 -0.575096 0.350594 0.640739
-1.66205 -0.81154 0.628861 -0.344299 -0.443568 -0.641736 -0.311739 -1.40636 0.0282577 0.30709
1.76773 0.476248 2.18 0.0677721 -0.446775 0.627672 -0.956196 2.26786 0.473048 -0.0331294
0.025801 1.02126 -0.0690173 -0.65057 0.013305 0.24833 0.473414 -0.770854 -0.919155 -1.37475
It depends on what you mean by “same results” - as you say
using Distributions
rand(Normal(), 40, 10)
in Julia is equivalent to
matrix(rnorm(400), nrow = 40)
in that it produces a 40x10 matrix of random numbers, drawn from a standard normal distribution. Of course the numbers won’t exactly be the same, so if you need the exact same numbers in Julia and R for some reason then you probably need either RCall (or JuliaCall from the R side) or some other way of transferring the numbers (e.g. writing out to and reading back in from csv).
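For the csv route, a minimal sketch using Julia's DelimitedFiles standard library might look like the following. The file name and the `original`/`restored` variables are assumptions for illustration; in practice the file could just as well be written from R as plain comma-separated numbers without a header or row names.

```julia
using DelimitedFiles

# Stand-in for a matrix produced in one session (assumption: any Float64 matrix)
original = randn(40, 10)

# Write comma-separated values to a temporary file (hypothetical file name)
path = joinpath(mktempdir(), "dataMatrix.csv")
writedlm(path, original, ',')

# Read the numbers back, for example in another session
restored = readdlm(path, ',', Float64)

println(size(restored))
println(restored ≈ original)
```

This way both sides work from the same stored numbers instead of relying on the two languages' random number generators agreeing.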