# Expected Value

The expected value is a mathematical construction meant to capture the notion of "what outcome would I observe, on average, if I repeated this experiment many times?" For discrete systems it is defined as:

$E[X] = \sum_x x P_X(x)$

It is worth noting that the expected value is not a random variable (though once we move to conditional expectation this will change). Consider the system below:

![](images/Probability%20(models%20and%20inequalities).png)

We see that $X$ maps points $\omega$ from our sample space $\Omega$ to $\mathbb{R}$, and the resulting points have areas of high and low density. If each point has the same probability of being observed, i.e. the uniform case $P_X(x) = \frac{1}{|\Omega|}$, then the expected value returns the average value we would expect to see if we performed many experiments. Likewise, if our points have different probabilities of being observed (say the points in the dense region, of which there are many, each have a small probability of actually being observed), the expected value accounts for that as well, with those values contributing less to the final result.

Now, each point $x$ is multiplied by (see [Multiplication as of](Multiplication%20as%20of.md)) its probability of occurrence, $P_X(x)$, yielding an *area*:

![](images/Probability%20(models%20and%20inequalities)%203.png)

If we sum up all of these areas, across all values $x$ that $X$ can take, the final sum is our expected value. What is great is that we can see visually that our expected value is perfectly captured by the area under the [CDF](Cumulative%20Distribution%20Function.md) (technically the [Survival-Function](Survival-Function.md)).
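As a quick sanity check of the two views above, here is a minimal sketch (the fair-die pmf is a made-up example; for a discrete non-negative variable the "area under the survival function" becomes the sum $\sum_{k \geq 0} P(X > k)$):

```python
# Two equivalent ways to compute E[X] for a discrete, non-negative
# random variable, using a fair six-sided die as a toy example.
die = {x: 1 / 6 for x in range(1, 7)}  # uniform pmf: P_X(x) = 1/6

# 1) The definition: E[X] = sum_x x * P_X(x)
ev_definition = sum(x * p for x, p in die.items())

# 2) Summing the survival function P(X > k) over k = 0, 1, 2, ...
ev_survival = sum(
    sum(p for x, p in die.items() if x > k)  # P(X > k)
    for k in range(6)
)

print(ev_definition, ev_survival)  # both approximately 3.5
```

Both routes agree, which is exactly the "area" picture: stacking the point masses $x \cdot P_X(x)$ gives the same total as slicing the area under the survival function horizontally.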
![](images/Probability%20(models%20and%20inequalities)%205.png)

So, for a non-negative random variable, we can write our expected value as follows:

$E[X] = \int_0^{\infty} F_X^C(x) \, dx$

Besides being a nice way of calculating the expected value in general, this is also helpful in the construction of [Probability-Inequalities](Probability-Inequalities.md)!

### A note on notation: Random vs Non-Random variables

Consider our basic discrete system again:

$E[X] = \sum_x x P_X(x)$

On the left-hand side we have $X$, which represents a [Random Variable](Random%20Variable.md) that follows a specific probability distribution, in this case $P_X$. On the right-hand side we are summing over $x$, which represents a *specific value* that $X$ can take. So this $x$ is *not random*, but represents a concrete outcome in the domain of $X$. See more in [this response](https://chat.openai.com/share/9c8e1e7e-e2d8-4bb8-9ffb-b02c35e8a298).

### A note on subscripts

If an expectation has many terms inside of it, we can assume that we are computing the expected value *with respect to all terms that are random variables* inside the expectation, weighting by the joint density:

$E[f(X, Y, Z)] = \int \int \int f(x, y, z) \, p(x, y, z) \; dx \; dy \; dz$

However, if we start adding subscripts to $E$, we can assume that we are restricting the distribution that we are taking the expectation over. For example:

$E_X[f(X, Y, Z)] = \int f(x, Y, Z) \, p_X(x) \; dx$

Above, we are effectively saying that $Y$ and $Z$ are fixed (determined) and $X$ is the only random variable. See more here: [neural networks - What does it mean to take the expectation with respect to a probability distribution? - Cross Validated](https://stats.stackexchange.com/questions/487095/what-does-it-mean-to-take-the-expectation-with-respect-to-a-probability-distribu)

---

Date: 20211008

Links to: [Probability MOC](Probability%20MOC.md) [Multiplication as of](Multiplication%20as%20of.md)

Tags:

References: