Proper use of scaling on axes from ordinations (like MDS and PCA)?

Moving a conversation from slack over to discourse because there’s a bunch of useful stuff that I’d like preserved, but also because we haven’t actually reached a consensus answer.

The original question (by me):

Question about ordinations (came up in Plot recipe for MDS from MultivariateStats.jl by kescobo · Pull Request #229 · JuliaPlots/StatsPlots.jl · GitHub): When plotting an ordination from multidimensional scaling (PCoA), is it important to keep the axes on the same scale? Eg. if the first axis is -1 to 1, and the second is -0.5 to 0.5, should the distance between the farthest points in the first dimension be twice that in the second dimension?

I thought that the units of the axes were arbitrary (since distance need not be metric) so the scaling doesn’t actually matter, but Michael seems to think differently. Also, is it the same or different when doing PCA or other ordinations?

I’ll let @mkborregaard lay out his current position and respond as needed, but other perspectives would be welcomed.

Yes, I think aspect ratio should be equal, because the units of x and y are the same. I think it gives the most fair image, also of the degree of variation on each axis.
I also believe it is the convention - on Slack I came with a number of references to the use of equal aspect ratio:

This blog post from one of the developers of R’s vegan package (the R package for community ecology) recommends customising ordination plots by setting aspect ratio to 1: Customising vegan's ordination plots

So does the owner of the ggvegan package, R’s package for making ordination plots with ggplot2: Recreating autoplot.cca() from fortify.cca() data frame · Issue #9 · gavinsimpson/ggvegan · GitHub

So does the asker of this SO question: plot - R - how to make PCA biplot more readable - Stack Overflow

Here’s a paper that argues for a different method - scaling pca plots by the eigenvalues on the axes. That must be considered an alternative method. They even write how the peer reviewers remarked on the aspect ratio not being equal https://f1000research.com/articles/5-1492/v2

For MDS, which is distance-preserving, I would also argue that distances are best preserved with aspect ratio 1. I laid out this argument and small toy analysis for it:

When you plot the points, it translates the x and y values into pixel positions on the screen. this is equivalent to multiplying x with a constant a that depends on the pixel concentration, the extent of the x axis in screen pixels and the values associated with the x axis limits, and multiplying y with b that depends on the same attributes of the y axis
The aspect ratio is then defined as c = a/b, so given that all of these are constant the plot coordinates are proportional to multiplying y with c.

Now if we take an example

using StatsPlots, Distances, MultivariateStats
m = rand(100, 10)
dm = pairwise(BrayCurtis(), m)
mds = fit(MDS, dm, distances=true)
proj = projection(mds)
x, y = (proj[:,i] for i in 1:2)

We’ve fit a PCoA to the dm distance matrix of distances among m, and extracted the first two components, x and y, which is the translation into two dimensions that best maintains the distances between points. Now let’s calculate the correlation between the distances on the plot and the distances among the input points. To do that we need to multiply by the aspect ratio, c.

using Statistics
function getcor(c, dm, x, y)
       fin = pairwise(Euclidean(), vcat(x', c .* y'), dims = 2)
       cor(vec(fin), vec(dm))
 end
 res = [getcor(i, dm, x, y) for i in 0.1:0.01:3]
 plot(0.1:0.01:3, res)

Here’s the result

1 looks like a good aspect ratio here. It would be cool to see this for a real-life code case with non-Euclidean distances, as that is what MDS is often used for.

Here’s a real-life example (human microbiomes) trying out your code. I calculated the pairwise Bray Curtis dissimilarity and plotted it as you suggested:

So here the peak is closer to c ≈ 0.6. I’m guessing that the discrepancy is because, in a random matrix, there probably aren’t axes that describe a disproportionate amount of the variation. Looking at a scree plot, for my data:

Compared to one for a random matrix with the same number of dimensions:

I find this genuinely interesting, as I was expecting the pcoa to maintain distances. I wonder - could this be related to the eigenvalue ratio, as suggested in the F1000 paper? And does anyone else here know anything about this? :slight_smile:

I wonder - could this be related to the eigenvalue ratio, as suggested in the F1000 paper?

If it is, it’s not straightforward. Using my real life example, I took some subsets of the data and calculated the aspect ratio with the highest correlation (maxc), what that correlation was (maxcor) and the ratio of eigenvalues. Here’s what I found:

function maxcor(mds, dm; dim1=1, dim2=2, rng = 0.1:0.01:3)
    proj = projection(mds)
    x, y = (proj[:,i] for i in [dim1, dim2])

    e1, e2 = eigvals(mds)[[dim1, dim2]]
    
    res = [getcor(i, dm, x, y) for i in rng]
    (mc, i) = findmax(res)
    return (maxc = rng[i], maxcor=mc, eigratio=e2/e1)
end


function ploteigs(dm; dims=50, points=200)
    c = Float64[]
    er = Float64[]
    cor = Float64[]
    
    for _ in 1:points
        picks = rand(1:size(dm, 1), dims)
        subdm = dm[picks, picks]
        mds = fit(MDS, subdm, distances=true)

        for i in 1:5
            mc = maxcor(mds, subdm, dim1=1, dim2=i)
            push!(c, mc[:maxc])
            push!(er, mc[:eigratio])
            push!(cor, mc[:maxcor])
        end
    end
    
    println(length(c))
    scatter(c, er, zcolor=cor)
end

ploteigs(dm)

Just for funsies, here it is with a random matrix:

You should set aspect_ratio = 1 in those plots, given that the scales of x and y are supposed to be the same :smiley:

:confounded:

1 Like

hahahaha.
What’s going on with all those where the eigenvalue ratio is 1?

No idea… we really need someone that knows what they’re talking about to join this conversation :stuck_out_tongue:

1 Like

BTW I think it’s a bug that Plots adjusts the image size by the aspect_ratio. It should use aspect_ratio to redefine the x and y lims

One more data point (in your favor @mkborregaard) is here (see Tip 6 and figure 2).

I’m still not convinced though.

This came up in my lab’s slack discussion. Someone found the reason for all of those eigenvalues ratio = 1 points…

        for i in 1:5
            mc = maxcor(mds, subdm, dim1=1, dim2=i)

should have been 2:5 :man_facepalming:

A couple more comments from Siyuan Ma, a grad student in the Huttenhower lab:

Pasting the slack thread here:

So do we end up agreeing with the faculty1000 paper that aspect ratio should be equal to the proportion of the variances?

Possibly. To complete the picture - using the rows of the transform rather than the columns of the projection:

using StatsPlots, Distances, MultivariateStats

m = rand(100, 10)
dm = pairwise(BrayCurtis(), m, dims=2)
mds = fit(MDS, dm, distances=true)

tfm = collect(transform(mds)')

x, y = (tfm[:,i] for i in 1:2)

using Statistics

function getcor(c, dm, x, y)
       fin = pairwise(Euclidean(), vcat(x', c .* y'), dims = 2)
       cor(vec(fin), vec(dm))
end

res = [getcor(i, dm, x, y) for i in 0.1:0.01:3]
plot(0.1:0.01:3, res)

but for real data:

And doing my simulation:

function maxcor(mds, dm; dim1=1, dim2=2, rng = 0.1:0.01:3)
    tfm = collect(transform(mds)')
    x, y = (tfm[:,i] for i in [dim1, dim2])

    e1, e2 = eigvals(mds)[[dim1, dim2]]
    
    res = [getcor(i, dm, x, y) for i in rng]
    (mc, i) = findmax(res)
    return (maxc = rng[i], maxcor=mc, eigratio=e2/e1)
end


function ploteigs(dm; dims=50, points=200)
    c = Float64[]
    er = Float64[]
    cor = Float64[]
    
    for _ in 1:points
        picks = rand(1:size(dm, 1), dims)
        subdm = dm[picks, picks]
        mds = fit(MDS, subdm, distances=true)

        for i in 2:5
            mc = maxcor(mds, subdm, dim1=1, dim2=i)
            push!(c, mc[:maxc])
            push!(er, mc[:eigratio])
            push!(cor, mc[:maxcor])
        end
    end
    
    println(length(c))
    scatter(c, er, zcolor=cor, aspect_ratio=1)
end

ploteigs(dm)

Continuing the discussion from Proper use of scaling on axes from ordinations (like MDS and PCA)?:

Coming to this late, but I noticed hits from this page on my blog and was curious.

I think we have to be a little careful when speaking in generalities especially when comparing domains of usage of these methods. Vegan has more of a focus on PCA/RDA, CA/CCA, and NMDS rather than PCoA (principal coordinates analysis, PCO, classic MDS) and with those methods we generally prepare the plots with aspect ratio 1. This is not to say that we don’t rescale the axis scores (the eigen vectors); we do scale these but we still plot on equally scaled axes.

I’m intrigued to follow up on the advice from Gower; from what I cribbed above, though, this is a transformation of the data, the eigen vectors, prior to plotting. I’d be very surprised if plotting was done with an aspect ratio other than 1, if the interpretation relies on a distance interpretation on the plot as an approximation of the true distance.

That said, it’s not unheard of to use well-justified aspect ratios that differ from 1; take coinertia analysis and co-correspondence analysis, which compare two matrices of multivariate data, where adjusting the aspect ratio to be the quarter root of the ratio of the Eigenvalues of the axes being displayed is the optimal way to display the covariances between the two matrices. This produces a very-specific plot from the broader world of ordination, and which has some specific properties, but it’s been so long since I looked at these things I can’t recall the details currently.