# KD Tree
A k-d tree (short for $k$-dimensional tree) is a space-partitioning data structure for organizing points in $k$-dimensional space. $k$-d trees are a useful data structure for several applications, such as searches involving a multidimensional search key (range searches and nearest neighbor searches) and creating point clouds.
The _k_-d tree is a [binary tree](https://en.wikipedia.org/wiki/Binary_tree "Binary tree") in which _every_ node is a _k_-dimensional point. Every non-leaf node can be thought of as implicitly generating a splitting [hyperplane](https://en.wikipedia.org/wiki/Hyperplane "Hyperplane") that divides the space into two parts, known as [half-spaces](https://en.wikipedia.org/wiki/Half-space_(geometry) "Half-space (geometry)"). Points to the left of this hyperplane are represented by the left subtree of that node and points to the right of the hyperplane are represented by the right subtree. The hyperplane direction is chosen in the following way: every node in the tree is associated with one of the _k_ dimensions, with the hyperplane perpendicular to that dimension's axis. So, for example, if for a particular split the "x" axis is chosen, all points in the subtree with a smaller "x" value than the node will appear in the left subtree and all points with larger "x" value will be in the right subtree. In such a case, the hyperplane would be set by the x value of the point, and its [normal](https://en.wikipedia.org/wiki/Surface_normal "Surface normal") would be the unit x axis.
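As a concrete sketch of that structure (Python, with illustrative names like `Node` that are my own, not from the sources below), each node just stores a point, the axis its hyperplane is perpendicular to, and its two subtrees:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, ...]  # a k-dimensional point

@dataclass
class Node:
    point: Point             # the point stored at this node; it defines the splitting hyperplane
    axis: int                # the dimension the hyperplane is perpendicular to
    left: Optional["Node"]   # subtree of points with a smaller coordinate on `axis`
    right: Optional["Node"]  # subtree of points with a larger (or equal) coordinate on `axis`
```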
### Nice Overview
The way the algorithm works is as follows:
1. Pick a dimension (at random, or by cycling through the $k$ axes) and find the median of the data along it; with a selection algorithm this is $O(N)$. Split the data into two groups at that median.
2. Repeat with a new dimension for each group. This recursion is why we end up with a tree structure (a minimal build sketch is given right after this list).
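A minimal build sketch, reusing the `Node` class from above and cycling through dimensions (the Lavrenko video picks a random dimension instead; a true $O(N)$ median would use a selection algorithm rather than the sort shown here):

```python
from typing import Optional, Sequence

def build_kdtree(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    if not points:
        return None
    k = len(points[0])
    axis = depth % k                             # cycle dimensions; random choice also works
    pts = sorted(points, key=lambda p: p[axis])  # O(N log N); quickselect would give O(N)
    mid = len(pts) // 2                          # index of the median along `axis`
    return Node(point=pts[mid],
                axis=axis,
                left=build_kdtree(pts[:mid], depth + 1),
                right=build_kdtree(pts[mid + 1:], depth + 1))

tree = build_kdtree([(7, 2), (5, 4), (9, 6), (4, 7), (8, 1), (2, 3)])
```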
Visually, each split draws an axis-aligned line through the median point, carving the plane into nested rectangles; the Lavrenko video linked below steps through this nicely.

Geometrically this is a beautiful concept. Each time we perform a split we are really *partitioning* our space with a **hyperplane**. The final result is that we have partitioned our space into axis-aligned **hyperrectangles** (boxes), and we have a tree data structure that indexes them. So given a new data point, we can simply walk down the tree, comparing one coordinate per level, until we end up in one of those boxes.
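That walk is just a sequence of one-coordinate comparisons; a rough sketch, again assuming the `Node`/`tree` from the sketches above:

```python
def descend(node: Optional[Node], query: Point) -> list:
    """Follow the splits from the root down to the leaf cell (box) containing `query`."""
    visited = []
    while node is not None:
        visited.append(node.point)
        if query[node.axis] < node.point[node.axis]:
            node = node.left      # query lies on the "smaller" side of the hyperplane
        else:
            node = node.right     # query lies on the "larger or equal" side
    return visited

print(descend(tree, (6, 3)))      # points compared against on the way down
```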
From a nearest-neighbor perspective, a single walk down the tree is only an *approximate technique*: the true nearest neighbor may sit in an adjacent cell just across a splitting plane. An exact search therefore has to backtrack, descending into a sibling subtree whenever the splitting plane is closer to the query than the best distance found so far.
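A hedged sketch of that exact search with backtracking, reusing the structures above (the pruning idea is the standard one described in the Wikipedia article; the function name and layout here are my own):

```python
def nearest_neighbor(node, query, best=None):
    """Exact nearest neighbor: descend toward the query first, then only visit the
    far side of a split if the splitting plane could hide a closer point."""
    if node is None:
        return best
    sq_dist = sum((a - b) ** 2 for a, b in zip(node.point, query))
    if best is None or sq_dist < best[1]:
        best = (node.point, sq_dist)                   # (point, squared distance)
    gap = query[node.axis] - node.point[node.axis]     # signed distance to the splitting plane
    near, far = (node.left, node.right) if gap < 0 else (node.right, node.left)
    best = nearest_neighbor(near, query, best)         # search the side containing the query first
    if gap ** 2 < best[1]:                             # plane closer than current best: backtrack
        best = nearest_neighbor(far, query, best)
    return best

point, sq_dist = nearest_neighbor(tree, (6, 3))        # closest of the six example points
```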
---
Date: 20210824
Links to: [Algorithms and Data Structures](Algorithms%20and%20Data%20Structures.md)
Tags: #review
References:
* [KD Tree Algorithm: How it works (Victor Lavrenko)](https://www.youtube.com/watch?v=TLxWtXEbtFE)
* [Wikipedia](https://en.wikipedia.org/wiki/K-d_tree)