Yes, I think the aspect ratio should be equal, because x and y are in the same units. I think it gives the fairest picture, including of how much variation there is along each axis.
I also believe it is the convention. On Slack I posted a number of references to the use of an equal aspect ratio:
This blog post from one of the developers of R’s vegan package (the R package for community ecology) recommends customising ordination plots by setting aspect ratio to 1: Customising vegan's ordination plots
So does the owner of the ggvegan package, R’s package for making ordination plots with ggplot2: Recreating autoplot.cca() from fortify.cca() data frame · Issue #9 · gavinsimpson/ggvegan · GitHub
So does the asker of this SO question: plot - R - how to make PCA biplot more readable - Stack Overflow
Here’s a paper that argues for a different method: scaling the axes of PCA plots by their eigenvalues. That should be considered an alternative method. They even mention that the peer reviewers remarked on the aspect ratio not being equal: https://f1000research.com/articles/5-1492/v2
For MDS, which is distance-preserving, I would also argue that distances are best preserved with aspect ratio 1. I laid out this argument, with a small toy analysis to back it up:
When you plot the points, their x and y values are translated into pixel positions on the screen. This is equivalent to multiplying x by a constant a, which depends on the pixel density, the extent of the x axis in screen pixels, and the values at the x axis limits, and multiplying y by a constant b, which depends on the same attributes of the y axis. The aspect ratio is then defined as c = b/a, so given that all of these are constant, the screen coordinates (a*x, b*y) are proportional to (x, c*y), i.e. to multiplying y by c.
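To make the scaling argument concrete, here is a minimal sketch using only the standard library. The helpers `to_screen` and `screen_ratio` are hypothetical names made up for this illustration, and the pixel factors 50 and 100 are arbitrary. It takes two pairs of points that are equally far apart in data space and compares their distances on screen for different a and b.

```julia
using LinearAlgebra  # standard library, for norm

# Hypothetical helper: map data coordinates to screen coordinates,
# with a and b screen pixels per data unit on the x and y axes.
to_screen(p, a, b) = [a * p[1], b * p[2]]

# Toy points: q is 1 unit from p along x, r is 1 unit from p along y,
# so |pr| / |pq| is 1 in data space.
p, q, r = [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]

# Ratio of the on-screen distances |pr| / |pq|
screen_ratio(a, b) = norm(to_screen(r, a, b) - to_screen(p, a, b)) /
                     norm(to_screen(q, a, b) - to_screen(p, a, b))

equal_axes = screen_ratio(50, 50)    # a == b: the ratio stays at 1
stretched_y = screen_ratio(50, 100)  # y stretched twice as much as x: the ratio doubles
```

With equal scaling the two distances still look the same on screen; with unequal scaling the vertical distance looks twice as long, even though nothing changed in the data.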
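Now let's take an example:
using StatsPlots, Distances, MultivariateStats
m = rand(100, 10)                         # random toy data
dm = pairwise(BrayCurtis(), m, dims = 2)  # Bray-Curtis distances among the columns of m (dims made explicit)
mds = fit(MDS, dm, distances = true)      # dm is a distance matrix, not raw data
proj = projection(mds)
x, y = (proj[:, i] for i in 1:2)          # first two components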
We’ve fit a PCoA to dm, the distance matrix computed from m, and extracted the first two components, x and y, which are the translation into two dimensions that best maintains the distances between points. Now let’s calculate the correlation between the distances on the plot and the distances among the input points. To get the distances on the plot, we need to multiply y by the aspect ratio, c.
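using Statistics
# Correlation between the input distances dm and the distances
# between plotted points when y is scaled by the aspect ratio c
function getcor(c, dm, x, y)
    # 2×n matrix of plot coordinates; dims = 2 treats each column as a point
    fin = pairwise(Euclidean(), vcat(x', c .* y'), dims = 2)
    cor(vec(fin), vec(dm))
end
res = [getcor(i, dm, x, y) for i in 0.1:0.01:3]
plot(0.1:0.01:3, res)  # correlation as a function of aspect ratio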
Here’s the result
An aspect ratio of 1 looks like a good choice here. It would be cool to see this for a real-life case with non-Euclidean distances, as that is what MDS is often used for.
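As a sanity check of the approach, here is a small self-contained sketch (standard library only) of the simplest possible case: points that already live in two dimensions, so the input distances are exactly the Euclidean distances on an undistorted plot. The helper names `pairdists` and `cor_at`, the seed, and the number of points are all made up for this illustration. If the method works, the best aspect ratio on the same grid as above should come out at 1.

```julia
using Random, Statistics, LinearAlgebra  # all standard library

# Hypothetical helper: all pairwise Euclidean distances between the
# columns of a 2×n matrix, returned as a flat vector.
pairdists(P) = [norm(P[:, i] - P[:, j]) for i in axes(P, 2) for j in axes(P, 2)]

rng = MersenneTwister(1)  # fixed seed, arbitrary choice
P = rand(rng, 2, 50)      # 50 toy points that already live in 2D
d_true = pairdists(P)     # the "input" distances

# Correlation between the true distances and the distances after scaling y by c
cor_at(c) = cor(pairdists([1 0; 0 c] * P), d_true)

cs = 0.1:0.01:3                   # same grid of aspect ratios as above
best_c = cs[argmax(cor_at.(cs))]  # aspect ratio with the highest correlation
```

This only shows that the procedure recovers 1 when 1 is known to be the right answer; it says nothing by itself about non-Euclidean input distances.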