The Mahalanobis distance is a measure of the distance between a point P and a distribution D, introduced by P. C. Mahalanobis in 1936. It is a multi-dimensional generalization of the idea of measuring how many standard deviations away P is from the mean of D. This distance is zero for P at the mean of D and grows as P moves away from the mean along each principal component axis. If each of these axes is re-scaled to have unit variance, then the Mahalanobis distance corresponds to standard Euclidean distance in the transformed space. The Mahalanobis distance is thus unitless, scale-invariant, and takes into account the correlations of the data set.

https://en.wikipedia.org/wiki/Mahalanobis_distance
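For reference, writing $\mu$ and $\Sigma$ for the mean vector and covariance matrix of $D$, the distance from a point $x$ to $D$ is

$$d_M(x) = \sqrt{(x - \mu)^\top \Sigma^{-1} (x - \mu)}.$$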
Mahalanobis distance is just a way of stretching Euclidean space to achieve a certain sort of isotropy (it normalizes an ellipsoid to the unit sphere). It is built on top of Euclidean distance and is not an alternative to it.
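Here is a minimal sketch of that normalization in Python (the synthetic data `X`, the covariance values, and the test point `p` are all illustrative, not from the thread): whitening by the Cholesky factor of the covariance maps the ellipsoid to the unit sphere, after which the Mahalanobis distance is just the ordinary Euclidean norm.

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

# Illustrative sample standing in for the distribution D.
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=[0.0, 0.0],
                            cov=[[4.0, 1.2], [1.2, 1.0]], size=1000)

mu = X.mean(axis=0)               # sample mean of D
Sigma = np.cov(X, rowvar=False)   # sample covariance of D
p = np.array([3.0, 1.0])          # the point P

# Whitening: with Sigma = L L^T (Cholesky), solve L y = p - mu.
# In the whitened coordinates the ellipsoid of D is the unit sphere,
# and the Mahalanobis distance is the plain Euclidean norm of y.
L = np.linalg.cholesky(Sigma)
y = np.linalg.solve(L, p - mu)
md_whitened = np.linalg.norm(y)

# Cross-check against SciPy, which takes the inverse covariance.
md_scipy = mahalanobis(p, mu, np.linalg.inv(Sigma))

print(md_whitened, md_scipy)  # agree up to floating-point error
```

Solving against the Cholesky factor rather than inverting `Sigma` explicitly is also the numerically preferable route.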
Euclidean distance works well in 2D and 3D as special cases. I would say Mahalanobis distance is its generalization (yes, built on top of it), which works better in the multidimensional (multivariate) case.
No. Mahalanobis distance is not an alternative to Euclidean distance, because it's not even measuring the same kind of distance. They are incommensurate, both figuratively and literally: Mahalanobis distance is unitless while Euclidean distance is not.
Euclidean distance measures the distance between two points, while Mahalanobis measures the distance between a distribution (canonically multivariate normal) and a point. Mahalanobis distance is not a generalization of Euclidean distance; it's an altogether different concept of distance that doesn't even make sense without talking about a distribution with a mean and covariance matrix.
I agree with your larger point, but the following claim of yours is a bit of a red herring:
> Euclidean distance measures the distance between two points, while Mahalanobis measures the distance between a distribution (canonically multivariate normal) and a point
In a discussion about metrics and metric spaces we don't care about those things; they are abstracted out and considered irrelevant. All that matters is that we have a set of 'things' and a distance between pairs of such things that satisfies the properties of being a distance (more precisely, the properties of being a metric).
@CrazyStat (I cannot respond to your comment directly, so I'm leaving it here.)
I think you overlooked
> things that satisfies the properties of being a distance (more precisely, the properties of being a metric).
that I wrote. Of course it has to satisfy the properties of being a metric. The red herring, as far as dimensionality is concerned, is the complaint that Mahalanobis is defined over distributions while Euclidean is over points.
The part about MD that you get absolutely right is that it's nothing but Euclidean distance in a space that has been transformed by a linear transformation. MD (the version with the square root applied) and ED aren't that different, especially in the context of dimensionality.
@CrazyStat, in response to your second comment:
It indeed isn't; it's just Euclidean distance under a linear transformation. I was just quoting you; you had said
> while Mahalanobis measures the distance between a distribution
My point was that even if it is defined relative to a distribution, that's not really relevant.
> Mahalanobis "distance" is more closely related to a likelihood function than to a true distance function.
That's a subjective claim, and open to personal interpretation. Mathematically, MD is indeed a metric (equivalently, a distance), and it does show up in the log likelihood function. Mahalanobis was a statistician, but MD is a bona fide distance in any finite-dimensional linear space, with possible extensions to infinite-dimensional spaces by way of a positive definite kernel function (or equivalently, the covariance function of a Gaussian process).
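To spell out both claims (a sketch, with $\Sigma$ any fixed positive definite matrix): the two-point form

$$d_\Sigma(x, y) = \sqrt{(x - y)^\top \Sigma^{-1} (x - y)}$$

is the norm induced by the inner product $\langle u, v \rangle = u^\top \Sigma^{-1} v$, so it satisfies the metric axioms, and the multivariate normal log-density

$$\log p(x) = -\tfrac{1}{2}\, d_\Sigma(x, \mu)^2 - \tfrac{1}{2} \log \det(2\pi\Sigma)$$

is exactly where it shows up in the log likelihood.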
A function that measures the distance between two different classes of 'things' (a distribution and a point, in this case) is necessarily not a distance metric. It trivially fails to satisfy the triangle inequality, because at least one of d(x,y), d(x,z), d(y,z) will be undefined: no matter how you choose x, y, z, you'll end up trying to measure either the distance between two points or the distance between two distributions, neither of which the function can handle.
This is not a red herring; it's a fundamental issue.
MD isn't defined over distributions, though. There are perfectly good distance metrics defined over distributions, but MD isn't one of them. It's a "distance" between one distribution and one point, not between two distributions or two points.
Mahalanobis "distance" is more closely related to a likelihood function than to a true distance function.
Mahalanobis distance isn't that different from Euclidean distance at all as far as the effects of dimensionality are concerned; it just applies a stretch and rotation, or more accurately a linear transformation, to the space.
In short, much as I love the Mahalanobis distance's many properties, it does zilch for dimensionality.