# Log Scale
In order to think clearly about what it means to use a log scale, consider the plots below. On the left we have a regular, linear scale. On the right the same data is visualized via a log scale.

Why might we want to do this? To start, consider how ineffective the visualization on the left is. It looks as though all the probability mass is centered right at 0. And it very well may be. However, we may want to know what is going on outside of that tiny band, and a simple way to enable this is to use a log scale. In the right-hand plot we can see much more of what is going on outside the region right at 0; we just have to remember the massive order-of-magnitude differences in the y scale.
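A minimal sketch of how such a pair of plots can be generated (the data here is a placeholder: exponential samples standing in for a heavy-tailed distribution, since the original note's data is unspecified):
```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder heavy-tailed data; the actual data behind the plots is unspecified.
rng = np.random.default_rng(0)
data = rng.exponential(scale=1.0, size=10_000)

fig, (ax_lin, ax_log) = plt.subplots(1, 2, figsize=(10, 4))

ax_lin.hist(data, bins=100)
ax_lin.set_title("Linear y scale")

ax_log.hist(data, bins=100)
ax_log.set_yscale("log")  # same data, counts shown on a log scale
ax_log.set_title("Log y scale")

plt.show()
```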
### Log scale is about *differences*
But what is the scale really "about"? A log scale may be useful if you are more interested in **relative differences** instead of **absolute differences**. For instance, consider the following example:
```python
x1 = 10
x2 = 100
x3 = 1000

# Absolute differences grow with the magnitude of the values
x2_diff_x1 = x2 - x1  # 90
x3_diff_x2 = x3 - x2  # 900
```
In some applications those two differences are *not the same*: they are qualitatively different. However, in other applications (consider financial forecasting) we may view the difference between 10 and 100 as roughly equivalent to the difference between 100 and 1000. This is particularly true when designing error metrics: as the magnitude of the values we are trying to predict *grows*, we are far more interested in how close we are relatively, not absolutely.
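To make this concrete with the numbers above: once we take logs, the two gaps collapse to the same size, since each step is one order of magnitude:
```python
import numpy as np

# In log10 space the two gaps are identical: one order of magnitude each.
np.log10(100) - np.log10(10)    # 1.0
np.log10(1000) - np.log10(100)  # 1.0
```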
For instance, say we have a target `y` and a prediction `y_hat`:
```python
import numpy as np

y = np.array([10, 100, 1000])
y_hat = np.array([5, 50, 500])
```
We can compute the error in two ways here: one simply computes the absolute difference, and the other computes the relative difference via a log:
```python
abs_error = np.abs(y - y_hat)                  # absolute error per prediction
log_error = np.abs(np.log(y) - np.log(y_hat))  # relative error: distance in log space
```
Look at how drastic the difference is: the absolute errors are `[5, 50, 500]`, while the log errors are all identical, `ln(2) ≈ 0.693` for every prediction (each prediction is off by a factor of 2).

If your problem is mainly concerned with how close you are in terms of order of magnitude, the log transformation is *invaluable*: after log transforming, all three predictions have the same error. Without the log transform, the prediction of `500` when the true value is `1000` has by far the largest error.
Note that this is effectively what the metric **Mean Absolute Percentage Error** achieves. It is another way to ensure we are looking at *relative* error.
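A minimal sketch of MAPE for comparison (the implementation here is just illustrative):
```python
import numpy as np

def mape(y, y_hat):
    """Mean Absolute Percentage Error: error relative to the target's magnitude."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean(np.abs((y - y_hat) / y))

mape([10, 100, 1000], [5, 50, 500])  # 0.5: every prediction is off by half its target
```
Like the log error, this is unchanged if `y` and `y_hat` are both scaled by the same factor; it only sees the ratio.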
### Leaving off
TODO: Add notes from books on geometric mean (I forget which books they were; should be on the shelf)
Are logarithms a way of making relative change additive? Think about log scale plots: this should make the difference between 1 and 10 the same as the difference between 100 and 1000. They are the same distance apart in log space! Write this up in Obsidian! (Pg. 91 of *The Misbehavior of Markets*.) What a beautiful transform. Maybe put some beautiful figures together for this. It is a relative concept (relative change); it is focused on ratios, in a sense. Think about geometric growth.
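A quick sanity check of that intuition (a minimal sketch; the base and series are arbitrary): geometric growth becomes evenly spaced in log space, and the geometric mean is just the arithmetic mean taken in log space.
```python
import numpy as np

a, r = 1.0, 10.0
n = np.arange(5)
series = a * r**n          # geometric growth: 1, 10, 100, 1000, 10000

np.diff(np.log10(series))  # [1., 1., 1., 1.] -> equal steps in log space

# Geometric mean == exp of the arithmetic mean of the logs
geo_mean = np.exp(np.mean(np.log(series)))  # 100.0
```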
---
Date: 20231111
Links to: [Logarithms](Logarithms.md)
Tags:
References:
* [How to Lie with Histograms — Nicholas A. Rossi](https://www.rossidata.com/LinLog)