Thanks for the reply. I was expecting a distance between each variable (column) and the envelope of the reference set. That would be 500 x 3 in this case rather than a distance for every single observation and each member of the reference set because that’s the point of the Mahalanobis distance.
I figured that the zeros on the diagonal was for the reason you say and that makes perfect sense but what I was hoping to understand was why the Python example gets a single value (that appears to come for the diagonal) for the distance of each observation to the reference set. I have seen other examples worked in Excel and elsewhere that also return a single value for Mahalanobis distance, so I presumed there was a method to ‘collapse’ all the distances to a single value (root mean square of a row of distances - is that legitimate?).
EDIT: I implemented the python example exactly as it is in the link and I get a 500 x 500 array out, from which they are taking the diagonal.
EDIT 2: It turns out that there is an alternative version of the Mahalanobis formula that which uses distances from each observation to the central mean, which appears to be what the Python version is doing from looking at the code. This explains the dramatically different results but begs the question how we can call both these things “Mahalanobis distance”?