Kernel density estimation (KDE) is a way to approximate the distribution of a dataset using only the points in the dataset itself. KDE makes few modeling assumptions, so it is a general-purpose technique for modeling the probability distribution that generated your data.

To understand KDE, we need some intuition about the kernel function $k(x, q)$. This function should be large when $x$ and $q$ are similar and small when they are not. If the kernel has these properties, we call it a similarity kernel because it measures whether $x$ and $q$ are close to each other. For example, the Gaussian kernel $k(x, q) = \exp\left(-\frac{\lVert x - q \rVert^2}{2\sigma^2}\right)$ is near 1 when $x$ and $q$ coincide and decays smoothly toward 0 as they move apart.

![500](Screen%20Shot%202022-10-30%20at%2010.29.23%20AM.png)

By adding together all of the $k(x, q)$ values, we can measure how well $q$ fits in with the observed data. If $q$ is similar to many examples from the dataset, many of the kernel values will be large and the sum will be large. If $q$ is not similar to any examples, the kernel sum will be small. The KDE model simply divides the sum by a normalization constant $Z$ so that the result is a valid probability density:

$$\hat{p}(q) = \frac{1}{Z} \sum_{i=1}^{n} k(x_i, q)$$

where $Z$ is chosen so that $\hat{p}$ integrates to 1 (for the Gaussian kernel above, $Z = n\,(2\pi\sigma^2)^{d/2}$ in $d$ dimensions). The model assigns a high probability density to a query if it is close to many examples and a low density otherwise.
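To make this concrete, here is a minimal NumPy sketch of the procedure using the Gaussian kernel. The bandwidth `sigma` and the toy data are assumptions for illustration, not from the source.

```python
import numpy as np

def gaussian_kernel(x, q, sigma=1.0):
    """Similarity kernel: large when x and q are close, small otherwise."""
    return np.exp(-np.sum((x - q) ** 2) / (2 * sigma ** 2))

def kde(data, q, sigma=1.0):
    """Kernel density estimate at query point q: sum the kernel values
    against every dataset point, then divide by the normalization
    constant Z so the result integrates to 1."""
    n, d = data.shape
    kernel_sum = sum(gaussian_kernel(x, q, sigma) for x in data)
    z = n * (2 * np.pi * sigma ** 2) ** (d / 2)  # normalization constant
    return kernel_sum / z

# A query near the data gets a high density; a faraway query gets a low one.
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 2))
print(kde(data, np.array([0.0, 0.0])))  # close to many examples: high density
print(kde(data, np.array([5.0, 5.0])))  # far from the data: low density
```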
---
Date: 20221030
Links to: [Machine Learning MOC](Machine%20Learning%20MOC.md)
Tags: #review
References:
* [RACE Sketches for Kernel Density Estimation - Randorithms](https://randorithms.com/2020/09/15/RACE-KDE.html)