# AdaBoost

We can summarize the three main ideas behind **AdaBoost** (short for *adaptive boosting*) as follows:

1. AdaBoost combines many "weak learners" to make classifications. These weak learners are almost always **stumps** ([Decision Trees](Decision%20Trees.md) with a depth of 1).
2. Some stumps get more say in the classification than others.
3. Each stump is made by taking the previous stump's mistakes into account.

We can think of the steps as follows:

1. We start by building our first stump (weak learner). To do so, we iterate over our features, selecting the one that gives the best first split (the one that provides the most information gain, reduces entropy the most, minimizes the Gini index, etc. There are different choices we can make here, but they are all similar).
2. Once we have our first stump, we need to determine how much *say* it has in the final classification. We determine this based on how well the stump classified the samples: the more samples it classified correctly, the more say it has. We use the **total error** (the sum of the weights of the misclassified samples) to determine this:

   $$\text{Amount of say} = \frac{1}{2} \log\left(\frac{1 - \text{Total Error}}{\text{Total Error}}\right)$$

   Note that the amount-of-say curve looks like a sigmoid curve rotated 90 degrees clockwise. Note also that when determining the amount of say, the error is weighted by each sample's weight/importance. These weights are directly influenced by previous stumps: if a data point has not been classified correctly by previous stumps, that point's weight will be very high, ensuring that future stumps prioritize classifying it correctly. We also decrease the weights of correctly classified samples.
3. We then build a new bootstrapped dataset, sampling (with replacement) from our original dataset. We sample each point with probability equal to its weight.
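The steps above can be sketched in a few dozen lines of NumPy. This is a minimal illustrative implementation, not a reference one: the function names (`fit_stump`, `adaboost`, `predict`) are my own, labels are assumed to be in $\{-1, +1\}$, and it follows the bootstrapped-resampling variant from step 3 (the classical formulation instead reweights the loss directly).

```python
import numpy as np

def fit_stump(X, y, weights):
    """Exhaustively pick the depth-1 split (feature, threshold, polarity)
    that minimizes the weighted classification error."""
    best = (0, 0.0, 1, np.inf)  # (feature, threshold, polarity, error)
    for f in range(X.shape[1]):
        for t in np.unique(X[:, f]):
            for polarity in (1, -1):
                pred = np.where(X[:, f] >= t, polarity, -polarity)
                err = weights[pred != y].sum()
                if err < best[3]:
                    best = (f, t, polarity, err)
    return best

def adaboost(X, y, n_stumps=5, seed=0):
    """Train an ensemble of stumps; returns [(feature, threshold, polarity, say)]."""
    rng = np.random.default_rng(seed)
    n = len(y)
    weights = np.full(n, 1.0 / n)   # start with equal sample weights
    Xc, yc = X, y                   # current (possibly resampled) training set
    ensemble = []
    eps = 1e-10                     # guard against log(0) / division by zero
    for _ in range(n_stumps):
        f, t, polarity, err = fit_stump(Xc, yc, weights)
        # amount of say = 1/2 * log((1 - total_error) / total_error)
        say = 0.5 * np.log((1 - err + eps) / (err + eps))
        ensemble.append((f, t, polarity, say))
        pred = np.where(Xc[:, f] >= t, polarity, -polarity)
        # raise the weights of misclassified points, lower those of correct ones
        weights = weights * np.exp(-say * yc * pred)
        weights /= weights.sum()
        # bootstrap a new dataset, sampling each point with probability
        # equal to its weight, then reset the weights to uniform
        idx = rng.choice(n, size=n, p=weights)
        Xc, yc = Xc[idx], yc[idx]
        weights = np.full(n, 1.0 / n)
    return ensemble

def predict(ensemble, X):
    """Weighted vote: the sign of the sum of each stump's say."""
    total = np.zeros(len(X))
    for f, t, polarity, say in ensemble:
        total += say * np.where(X[:, f] >= t, polarity, -polarity)
    return np.where(total >= 0, 1, -1)
```

On a tiny separable example, the first stump found already has zero weighted error, so its say is very large and it dominates the vote, which is exactly the behavior the say formula is designed to produce.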
---
Date: 20211220
Links to:
Tags:
References:
* [AdaBoost, Josh Starmer youtube video](https://www.youtube.com/watch?v=LsK-xG1cLYA)
* [My blog post](https://www.nathanieldake.com/Machine_Learning/06-Ensemble_methods-05-AdaBoost.html)