# Distance (Metric)

One of the biggest paradigm shifts I’ve had as a DS is the realization of how important _and broad_ the concept of distance is! You have likely heard of Euclidean distance, which is just the normal distance that we tend to think about in the 2D and 3D world. We can generalize this to higher dimensions via some reasonable rules.

But what I found amazing (and mind blowing) is that, mathematically, _we get to choose our distance function_! One that we could pick is Euclidean, but there are infinitely many other distance functions we could pick. This video touches on a few non-Euclidean ones: [https://www.youtube.com/watch?v=Usngvpiv_LI](https://www.youtube.com/watch?v=Usngvpiv_LI)

Formally this leads us to the notion of a [metric space](https://en.wikipedia.org/wiki/Metric_space). A distance function only has to obey a few rules: the distance from a point to itself is zero, distinct points are a positive distance apart, the distance is symmetric, and the triangle inequality holds. Again, for each different distance function you pick, you’d get a different metric space!

Say we have a space of points, the Euclidean distance function, and some other distance function `d`. These _same points_ may be _close together in terms of Euclidean distance_ and _far apart in terms of our other distance function_ `d`.

This is where part of the “creativity/art” of DS and mathematics comes in. You may pick a distance function that aligns with the problem you are trying to solve (i.e. one that ensures elements that you deem similar have a small distance between them, and those deemed different have a large distance). This notion is incredibly deep and can really take you down an absolute rabbit hole.

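
To make the “same points, different distances” idea concrete, here is a minimal sketch in plain Python. The three distance functions (Euclidean, Manhattan, Chebyshev) are standard metrics, but the points `p`, `q`, `r` and the choice of 100 dimensions are made up purely for illustration.

```python
import math

def euclidean(a, b):
    """Straight-line (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Taxicab (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    """L-infinity distance: the largest single coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical points in 100 dimensions (illustrative assumption).
n = 100
p = [0] * n              # the origin
q = [1] * n              # a small step along every axis
r = [5] + [0] * (n - 1)  # a big step along a single axis

# Compare how far q and r are from p under each metric.
for name, dist in [("euclidean", euclidean),
                   ("manhattan", manhattan),
                   ("chebyshev", chebyshev)]:
    print(f"{name:>9}: d(p, q) = {dist(p, q):g}, d(p, r) = {dist(p, r):g}")
```

Under Euclidean distance `r` is the closer point to `p` (5 vs 10), under Manhattan it is closer by a huge margin (5 vs 100), and under Chebyshev the order flips and `q` is the closer one (1 vs 5). Same points, three different answers to “who is my nearest neighbor?”, which is exactly why the choice of distance function should follow from the problem.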
A few more interesting links I’ll toss out there:

- [There is a lemma](https://www.math3ma.com/blog/the-yoneda-embedding) that states that “An object is completely determined by its relationships to other objects”. The minute we think about relationships between objects we are (at least informally) talking about distance. It may go by a different name depending on what the objects are, but the bigger point of “relationships between objects” remains the same.
- There are “distances” between probability distributions. These are no longer between points! Our “objects” are now probability distributions. A few examples: [earth mover's distance](https://jeremykun.com/2018/03/05/earthmover-distance/), [information distance](https://jeremykun.com/2012/12/04/information-distance-a-primer/), [KL divergence](https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence) (which is not symmetric, so it is a “distance” only informally, not a metric).
- The volume of the unit n-ball in Euclidean space _goes to zero as the dimensionality increases!_ (see [here](https://divisbyzero.com/2010/05/09/volumes-of-n-dimensional-balls/))
  - This actually has fascinating implications in the domain of probability theory. Namely, in high dimensions, [Gaussian distributions are practically indistinguishable from uniform distributions on the unit sphere](https://www.inference.vc/high-dimensional-gaussian-distributions-are-soap-bubble/) (up to scaling: the samples of a standard n-dimensional Gaussian concentrate near a sphere of radius about √n).

---

Date: 20231031