Proper use of scaling on axes from ordinations (like MDS and PCA)?

kevbonham · May 25, 2019, 4:46pm

Moving a conversation from slack over to discourse because there’s a bunch of useful stuff that I’d like preserved, but also because we haven’t actually reached a consensus answer.

The original question (by me):

Question about ordinations (came up in Plot recipe for MDS from MultivariateStats.jl by kescobo · Pull Request #229 · JuliaPlots/StatsPlots.jl · GitHub): When plotting an ordination from multidimensional scaling (PCoA), is it important to keep the axes on the same scale? Eg. if the first axis is -1 to 1, and the second is -0.5 to 0.5, should the distance between the farthest points in the first dimension be twice that in the second dimension?

I thought that the units of the axes were arbitrary (since distance need not be metric) so the scaling doesn’t actually matter, but Michael seems to think differently. Also, is it the same or different when doing PCA or other ordinations?

I’ll let @mkborregaard lay out his current position and respond as needed, but other perspectives would be welcomed.

mkborregaard · May 25, 2019, 5:41pm

Yes, I think aspect ratio should be equal, because the units of x and y are the same. I think it gives the most fair image, also of the degree of variation on each axis.
I also believe it is the convention - on Slack I came with a number of references to the use of equal aspect ratio:

This blog post from one of the developers of R’s vegan package (the R package for community ecology) recommends customising ordination plots by setting aspect ratio to 1: Customising vegan's ordination plots

So does the owner of the ggvegan package, R’s package for making ordination plots with ggplot2: Recreating autoplot.cca() from fortify.cca() data frame · Issue #9 · gavinsimpson/ggvegan · GitHub

So does the asker of this SO question: plot - R - how to make PCA biplot more readable - Stack Overflow

Here’s a paper that argues for a different method - scaling pca plots by the eigenvalues on the axes. That must be considered an alternative method. They even write how the peer reviewers remarked on the aspect ratio not being equal https://f1000research.com/articles/5-1492/v2

For MDS, which is distance-preserving, I would also argue that distances are best preserved with aspect ratio 1. I laid out this argument and small toy analysis for it:

When you plot the points, it translates the x and y values into pixel positions on the screen. this is equivalent to multiplying x with a constant a that depends on the pixel concentration, the extent of the x axis in screen pixels and the values associated with the x axis limits, and multiplying y with b that depends on the same attributes of the y axis
The aspect ratio is then defined as c = a/b, so given that all of these are constant the plot coordinates are proportional to multiplying y with c.

Now if we take an example

using StatsPlots, Distances, MultivariateStats
m = rand(100, 10)
dm = pairwise(BrayCurtis(), m)
mds = fit(MDS, dm, distances=true)
proj = projection(mds)
x, y = (proj[:,i] for i in 1:2)

We’ve fit a PCoA to the dm distance matrix of distances among m, and extracted the first two components, x and y, which is the translation into two dimensions that best maintains the distances between points. Now let’s calculate the correlation between the distances on the plot and the distances among the input points. To do that we need to multiply by the aspect ratio, c.

using Statistics
function getcor(c, dm, x, y)
       fin = pairwise(Euclidean(), vcat(x', c .* y'), dims = 2)
       cor(vec(fin), vec(dm))
 end
 res = [getcor(i, dm, x, y) for i in 0.1:0.01:3]
 plot(0.1:0.01:3, res)

Here’s the result

1 looks like a good aspect ratio here. It would be cool to see this for a real-life code case with non-Euclidean distances, as that is what MDS is often used for.

kevbonham · May 27, 2019, 1:53pm

Here’s a real-life example (human microbiomes) trying out your code. I calculated the pairwise Bray Curtis dissimilarity and plotted it as you suggested:

So here the peak is closer to c ≈ 0.6. I’m guessing that the discrepancy is because, in a random matrix, there probably aren’t axes that describe a disproportionate amount of the variation. Looking at a scree plot, for my data:

Compared to one for a random matrix with the same number of dimensions:

mkborregaard · May 27, 2019, 3:24pm

I find this genuinely interesting, as I was expecting the pcoa to maintain distances. I wonder - could this be related to the eigenvalue ratio, as suggested in the F1000 paper? And does anyone else here know anything about this?

kevbonham · May 27, 2019, 4:56pm

I wonder - could this be related to the eigenvalue ratio, as suggested in the F1000 paper?

If it is, it’s not straightforward. Using my real life example, I took some subsets of the data and calculated the aspect ratio with the highest correlation (maxc), what that correlation was (maxcor) and the ratio of eigenvalues. Here’s what I found:

function maxcor(mds, dm; dim1=1, dim2=2, rng = 0.1:0.01:3)
    proj = projection(mds)
    x, y = (proj[:,i] for i in [dim1, dim2])

    e1, e2 = eigvals(mds)[[dim1, dim2]]
    
    res = [getcor(i, dm, x, y) for i in rng]
    (mc, i) = findmax(res)
    return (maxc = rng[i], maxcor=mc, eigratio=e2/e1)
end


function ploteigs(dm; dims=50, points=200)
    c = Float64[]
    er = Float64[]
    cor = Float64[]
    
    for _ in 1:points
        picks = rand(1:size(dm, 1), dims)
        subdm = dm[picks, picks]
        mds = fit(MDS, subdm, distances=true)

        for i in 1:5
            mc = maxcor(mds, subdm, dim1=1, dim2=i)
            push!(c, mc[:maxc])
            push!(er, mc[:eigratio])
            push!(cor, mc[:maxcor])
        end
    end
    
    println(length(c))
    scatter(c, er, zcolor=cor)
end

ploteigs(dm)

kevbonham · May 27, 2019, 5:44pm

Just for funsies, here it is with a random matrix:

mkborregaard · May 27, 2019, 5:56pm

You should set aspect_ratio = 1 in those plots, given that the scales of x and y are supposed to be the same

kevbonham · May 27, 2019, 6:19pm

mkborregaard · May 27, 2019, 6:33pm

hahahaha.
What’s going on with all those where the eigenvalue ratio is 1?

kevbonham · May 27, 2019, 6:52pm

No idea… we really need someone that knows what they’re talking about to join this conversation

mkborregaard · May 27, 2019, 7:08pm

BTW I think it’s a bug that Plots adjusts the image size by the aspect_ratio. It should use aspect_ratio to redefine the x and y lims

kevbonham · June 21, 2019, 2:05pm

One more data point (in your favor @mkborregaard) is here (see Tip 6 and figure 2).

I’m still not convinced though.

kevbonham · June 21, 2019, 3:25pm

This came up in my lab’s slack discussion. Someone found the reason for all of those eigenvalues ratio = 1 points…

        for i in 1:5
            mc = maxcor(mds, subdm, dim1=1, dim2=i)

should have been 2:5

kevbonham · June 21, 2019, 5:11pm

A couple more comments from Siyuan Ma, a grad student in the Huttenhower lab:

Siyuan Ma [11:14 AM]
I think it is not guaranteed the correlation between Euclidean distance on first two axes of PCoA (using aspect ratio 1) vs. original Bray-Curtis need be great; they should be associated with how much total variance could be explained by those two. Here’s trying to replicate Kevin’s experiments in R:

set.seed(1)
R <- 1000
tb_cor <- (1:R) %>% 
  purrr::map_dfr(function(i_rep) {
    rsample <- sample.int(n = nrow(mat_dist), size = 50)
    dist_sub <- mat_dist[rsample, rsample] %>% as.dist()
    pcoa.fit <- ape::pcoa(dist_sub)
    cor.dist <- cor(as.matrix(dist(pcoa.fit$vectors[, 1:2]))[lower.tri(dist_sub)], 
                    as.matrix(dist_sub)[lower.tri(dist_sub)])
    tibble::tibble(i_rep = i_rep,
                   eigen.ratio = pcoa.fit$values$Eigenvalues[2] / pcoa.fit$values$Eigenvalues[1],
                   rel.eigen.sum = sum(pcoa.fit$values$Relative_eig[1:2]),
                   cor.dist = cor.dist)
  })
tb_cor %>% 
  ggplot(aes(x = rel.eigen.sum, y = cor.dist)) +
  geom_point() +
  xlab("Total variance explained by first two PCo") +
  ylab("Correlation with original dissimarity")
tb_cor %>% 
  ggplot(aes(x = eigen.ratio)) +
  geom_histogram()

Pasting the slack thread here:

Kevin Bonham [1 hour ago]
Could it be the case that the units of each axis are in fact arbitrary, but that the eigenvalue ratio is still relevant to % explained?

Eg, could you have an eigenvalue ratio of 99:1, but have the units of PCo.1 go from -1 to 1 and PCo.2 go from -5 to 5? I don’t understand the math well enough to know if this is a nonsense idea

Siyuan Ma [1 hour ago]
think by definition you have sum(PCo_i^2) = eigen_i, so the spread of PCos should be pretty proportional to their eigen values…
17 replies

Kevin Bonham [1 hour ago]
Do you mean the sum of the squares of each element in a given axis?

Siyuan Ma [1 hour ago]
Yes

Kevin Bonham [1 hour ago]
julia> m = rand(100, 10); dm = pairwise(BrayCurtis(), m, dims=2);

julia> mds = fit(MDS, dm, distances=true);

julia> eigvals(mds)
9-element Array{Float64,1}:
 0.09303369745046304
 0.07354269766225335
 0.06840335824030261
 0.06063452267157059
 0.055494272851594154
 0.052857486119026846
 0.04507085890321592
 0.0345790624648994
 0.020259336839508882

julia> proj = projection(mds);

julia> axsums = [sum(proj[:,i].^2) for i in 1:size(proj,2)]
9-element Array{Float64,1}:
 0.9999999999999997
 1.0000000000000002
 1.0
 0.9999999999999996
 0.9999999999999998
 0.9999999999999998
 1.0000000000000002
 0.9999999999999994
 1.0000000000000002
(edited)

Kevin Bonham [1 hour ago]
All of the sums of squares are == 1

Siyuan Ma [1 hour ago]
Huh

Siyuan Ma [1 hour ago]
I tried this in R and they are the same. I’ guessing then “projection” in Julia are probably not the coordinates as plotted in PCoA; rather they are the eigen vectors. So the coordiantes for PCo1 should be projection_1*sqrt(eigen_1)

Siyuan Ma [1 hour ago]
I was reading the documentation for this in Julia, see if comparing transform against projection makes sense?

Kevin Bonham [33 minutes ago]
Err… I thought that the eigen vectors was what’s plotted…

Kevin Bonham [27 minutes ago]
3.191891195797325e-16
Siyuan Ma [24 minutes ago]
Eigen vectors should always have square sum 1 so they don’t reflect the relative scales of PCos… see if you could plot proj[,1] * sqrt(eigen[1]) against transform(mds)[,1]?

Siyuan Ma [22 minutes ago]
Reference to manual:

Siyuan Ma [22 minutes ago]
https://github.com/JuliaStats/MultivariateStats.jl/blob/master/docs/source/cmds.rst

Siyuan Ma [17 minutes ago]
Wikipedia’s MDS page is not great for details Copying this excerpt from ape’s documentation for pcoa:

The eigenvectors are scaled to the square root of the corresponding eigenvalues. Gower (1966) has shown that eigenvectors scaled in that way preserve the original distance (in the D matrix) among the objects. These eigenvectors can be used to plot ordination graphs of the objects.

Kevin Bonham [15 minutes ago]
Weird - the transform object is row-major, but:
julia> scatter(
           proj[:, 1] .* sqrt(eig[1]),
           vec(t[1, :]), legend=false,)

julia> xlabel!("projection"); ylabel!("transform")
image1186×792 56.2 KB

mkborregaard · June 21, 2019, 6:14pm

So do we end up agreeing with the faculty1000 paper that aspect ratio should be equal to the proportion of the variances?

kevbonham · June 21, 2019, 6:23pm

Possibly. To complete the picture - using the rows of the transform rather than the columns of the projection:

using StatsPlots, Distances, MultivariateStats

m = rand(100, 10)
dm = pairwise(BrayCurtis(), m, dims=2)
mds = fit(MDS, dm, distances=true)

tfm = collect(transform(mds)')

x, y = (tfm[:,i] for i in 1:2)

using Statistics

function getcor(c, dm, x, y)
       fin = pairwise(Euclidean(), vcat(x', c .* y'), dims = 2)
       cor(vec(fin), vec(dm))
end

res = [getcor(i, dm, x, y) for i in 0.1:0.01:3]
plot(0.1:0.01:3, res)

but for real data:

And doing my simulation:

function maxcor(mds, dm; dim1=1, dim2=2, rng = 0.1:0.01:3)
    tfm = collect(transform(mds)')
    x, y = (tfm[:,i] for i in [dim1, dim2])

    e1, e2 = eigvals(mds)[[dim1, dim2]]
    
    res = [getcor(i, dm, x, y) for i in rng]
    (mc, i) = findmax(res)
    return (maxc = rng[i], maxcor=mc, eigratio=e2/e1)
end


function ploteigs(dm; dims=50, points=200)
    c = Float64[]
    er = Float64[]
    cor = Float64[]
    
    for _ in 1:points
        picks = rand(1:size(dm, 1), dims)
        subdm = dm[picks, picks]
        mds = fit(MDS, subdm, distances=true)

        for i in 2:5
            mc = maxcor(mds, subdm, dim1=1, dim2=i)
            push!(c, mc[:maxc])
            push!(er, mc[:eigratio])
            push!(cor, mc[:maxcor])
        end
    end
    
    println(length(c))
    scatter(c, er, zcolor=cor, aspect_ratio=1)
end

ploteigs(dm)

ucfagls · February 8, 2021, 3:51pm

Continuing the discussion from Proper use of scaling on axes from ordinations (like MDS and PCA)?:

Coming to this late, but I noticed hits from this page on my blog and was curious.

I think we have to be a little careful when speaking in generalities especially when comparing domains of usage of these methods. Vegan has more of a focus on PCA/RDA, CA/CCA, and NMDS rather than PCoA (principal coordinates analysis, PCO, classic MDS) and with those methods we generally prepare the plots with aspect ratio 1. This is not to say that we don’t rescale the axis scores (the eigen vectors); we do scale these but we still plot on equally scaled axes.

I’m intrigued to follow up on the advice from Gower; from what I cribbed above, though, this is a transformation of the data, the eigen vectors, prior to plotting. I’d be very surprised if plotting was done with an aspect ratio other than 1, if the interpretation relies on a distance interpretation on the plot as an approximation of the true distance.

That said, it’s not unheard of to use well-justified aspect ratios that differ from 1; take coinertia analysis and co-correspondence analysis, which compare two matrices of multivariate data, where adjusting the aspect ratio to be the quarter root of the ratio of the Eigenvalues of the axes being displayed is the optimal way to display the covariances between the two matrices. This produces a very-specific plot from the broader world of ordination, and which has some specific properties, but it’s been so long since I looked at these things I can’t recall the details currently.

Topic		Replies	Views
X and y axis equal scaling Visualization plots	2	6408	August 11, 2022
Setting `axis equal` General Usage plots	1	897	September 26, 2022
Proper use of MDS and PCA Statistics	4	337	November 14, 2022
Using same x- and y-axis scales in Makie plots Visualization makie	3	2256	March 29, 2021
The Scaling and centering Matrices New to Julia package , function	6	3095	November 16, 2020

Proper use of scaling on axes from ordinations (like MDS and PCA)?

Related topics