# Quantile Regression

### Overview

Normally we are interested in finding the *conditional mean*:

$\mathbb{E} (y|x)$

But we may also consider the *conditional median*:

$Median(y|x)$

Or, more generally, the *conditional percentiles* for $\tau \in (0, 1)$:

$Percentile_{\tau}(y|x)$

To understand this, it is helpful to first define what a **quantile** is.

> **Definition: Quantile**
> The $\tau^{\text{th}}$ quantile of $y$ is the value $\mu_{\tau}$ such that:
> $\tau = P(y \leq \mu_{\tau}) = F_y(\mu_{\tau})$
> Hence:
> $\mu_{\tau} = F_y^{-1}(\tau)$

To spell this out a bit more, say $y$ is a set of prices. Then $\mu_{\tau}$ is a price such that, if we sorted all prices in $y$, the value $\mu_{\tau}$ would split the sorted set so that a fraction $\tau$ of the elements lie *at or before* it, and a fraction $1 - \tau$ lie *after* it.

We can visualize this as follows. First, let us look at the [Cumulative Distribution Function](Cumulative%20Distribution%20Function.md) of $x$. It takes in elements of the domain, in this case values of $x$ (one of which is $\mu_{\tau}$), and returns values in the codomain $(0, 1)$:

![](Screenshot%202023-05-22%20at%207.30.55%20AM.png)

We can then simply take the inverse CDF, map $\tau$ through it, and get back the quantile $\mu_{\tau}$:

![](Screenshot%202023-05-22%20at%207.33.04%20AM.png)

The **conditional quantile** is the same thing, except that our CDF is now conditional on $x$:

> **Definition: Conditional Quantile**
> The conditional $\tau$ quantile of $y$ given $x$ is:
> $\mu_{\tau} = F^{-1}_{y|x}(\tau|x)$

Now let's look at another nice example. Say we have the following dataset:

![](Screenshot%202023-05-22%20at%207.37.22%20AM.png)

Say that we want to look at quantiles conditional on some $x$. Here is what that may look like if we conditioned on $x \approx 5.3$:

![](Screenshot%202023-05-22%20at%207.38.11%20AM.png)

So above we are looking at a specific $x$ (well, almost - technically the slice needs to be infinitely narrow), and given this $x$ we have a distribution of $y$ values. We can then compute the quantiles of that distribution! These quantiles can be seen below (looking at the $[5, 10, \dots, 95]$ percentiles):

![](Screenshot%202023-05-22%20at%207.39.12%20AM.png)

We can do this *across all $x$* and see what it looks like. Notice that the 95th percentile *increases* as $x$ *increases* (the highest red point in each vertical line climbs):

![](Screenshot%202023-05-22%20at%207.39.56%20AM.png)

What quantile regression does is draw a straight line through these points. For instance, here is the regression line for $\tau = 0.95$:

![](Screenshot%202023-05-22%20at%207.42.17%20AM.png)

We can also do this for $\tau = 0.5$ (the median):

![](Screenshot%202023-05-22%20at%207.46.04%20AM.png)

As well as for $\tau = 0.05$:

![](Screenshot%202023-05-22%20at%207.45.53%20AM.png)

This is what quantile regression does.
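To make this concrete, here is a minimal sketch (not from the original note) that fits the $\tau = 0.05$, $0.5$, and $0.95$ lines on synthetic heteroscedastic data. It uses `statsmodels`' `QuantReg`; the data-generating process and variable names are just assumptions for illustration.

```python
# A minimal quantile-regression sketch on synthetic data (illustrative only).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Synthetic heteroscedastic data: the spread of y grows with x,
# so the 0.05 / 0.5 / 0.95 lines should fan out.
n = 2000
x = rng.uniform(0, 10, size=n)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3 + 0.2 * x, size=n)
df = pd.DataFrame({"x": x, "y": y})

# Fit one straight line per quantile level tau.
for tau in (0.05, 0.50, 0.95):
    fit = smf.quantreg("y ~ x", df).fit(q=tau)
    print(f"tau={tau:.2f}: y = {fit.params['Intercept']:.2f} "
          f"+ {fit.params['x']:.2f} * x")
```

The fitted slope and intercept differ by quantile level, which is exactly the fan of lines shown in the screenshots above.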
### The loss function

![](Screenshot%202023-05-22%20at%207.55.09%20AM.png)

![](Screenshot%202023-05-22%20at%207.57.50%20AM.png)

![](Screenshot%202023-05-22%20at%207.58.20%20AM.png)

![](Screenshot%202023-05-22%20at%208.00.09%20AM.png)

### Some Details

![](Screenshot%202023-05-19%20at%202.42.07%20PM.png)

Note: If you are trying to predict a rank/quantile, it can be a bit confusing because you will be estimating the quantile of a quantile. So, for the conditional quantile, the result of $F_Y^{-1}(\alpha | X = x)$ will be a quantile - the quantile of the rank!

Perhaps an easier way to think about it is:

$F_Y: Y \rightarrow [0,1]$

$F_Y^{-1}: [0,1] \rightarrow Y$

where $Y$ is simply the space that our target lives in (so $Y$ could be a price, a weight, a rank, etc.).

### Pinball Loss

The pinball loss is a generalization of the MAE (Mean Absolute Error), which is used to find the *conditional median*. The pinball loss instead yields a prediction of an arbitrary conditional *quantile*. Concretely, for an observation $y$ and a prediction $\hat{y}$, $L_{\tau}(y, \hat{y}) = \tau (y - \hat{y})$ if $y \geq \hat{y}$, and $(1 - \tau)(\hat{y} - y)$ otherwise.

![](Screenshot%202023-05-19%20at%203.05.10%20PM.png)

The key idea is that the pinball loss is constructed so that there is a different penalty for over- versus under-estimation. This ends up yielding predictions that match whichever quantile we select. A nice proof that this does indeed yield the right thing is here: [Massimiliano Ungheretti - Modelling the extreme using Quantile Regression | PyData Global 2020 - YouTube](https://youtu.be/GpRuhE04lLs?t=1094). What we really want to show is that minimizing the pinball loss is equivalent to finding the quantile (a small numerical sketch of this is included at the end of the note).

![](Screenshot%202023-05-19%20at%203.07.46%20PM.png)

![](Screenshot%202023-05-19%20at%203.13.11%20PM.png)

---
Date: 20230519
Links to:
Tags:
References:
* [Quantile Regression Definition](https://www.lokad.com/quantile-regression-(time-series)-definition)
* [Pinball Loss Function Definition (Quantile Loss)](https://www.lokad.com/pinball-loss-function-definition)
* [Massimiliano Ungheretti - Modelling the extreme using Quantile Regression | PyData Global 2020 - YouTube](https://www.youtube.com/watch?v=GpRuhE04lLs)
* [An introduction to quantile regression - YouTube](https://www.youtube.com/watch?v=pAKwoz05lK4)
* [Quantile regression: The criterion function - YouTube](https://www.youtube.com/watch?v=sgR55l054DQ)
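To accompany the Pinball Loss section above, here is a small numerical sketch (not part of the original note; the data are made up) showing that the constant which minimizes the average pinball loss at level $\tau$ is the empirical $\tau$ quantile of the sample.

```python
# Numerical check (illustrative): the constant minimizing the mean pinball
# loss at level tau coincides with the empirical tau-quantile of the data.
import numpy as np

def pinball_loss(y, y_hat, tau):
    """Mean pinball (quantile) loss of predicting the constant y_hat."""
    diff = y - y_hat
    return np.mean(np.where(diff >= 0, tau * diff, (tau - 1) * diff))

rng = np.random.default_rng(42)
y = rng.lognormal(mean=0.0, sigma=0.75, size=10_000)  # made-up "prices"

tau = 0.95
candidates = np.linspace(y.min(), y.max(), 5_000)
losses = [pinball_loss(y, c, tau) for c in candidates]
best = candidates[int(np.argmin(losses))]

print(f"argmin of mean pinball loss: {best:.3f}")
print(f"np.quantile(y, {tau}):       {np.quantile(y, tau):.3f}")
```

The two printed values should agree up to the resolution of the candidate grid, which is the sense in which minimizing the pinball loss "finds" the quantile.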