# High Dimensional Spaces - Post Outline

#### Fill in the Top Box

The goal here is to get a clear statement of the subject we are discussing, as well as the answer, written down.

1. **What subject are you discussing?** The Curse of Dimensionality and the Age of Big Data.
2. **What question are you answering in the reader's mind[^2] about the subject?** In the Age of Big Data, it can appear that bringing in more (uncorrelated) data is always a good thing, but is it really?
3. **What is the answer?** No! The curse of dimensionality and the consequences of high dimensional spaces must be considered!

#### Match the Answer to the Introduction

Now that we have a first pass at a statement of the subject and answer, we want to make sure it is as clear as possible. To do that, we take the subject, move up to the *situation*, and make the first noncontroversial statement about it that we can. This should prompt the question "So what?". The answer to that question is our *complication*.

4. **What is the situation?** Ever since the scientific revolution, and certainly over the past 50 years, making decisions based on more data has been viewed as a best practice[^4]. Meanwhile, the Curse of Dimensionality and the properties of High Dimensional Spaces are still present.
5. **Develop the complication.** The two points above are in conflict! Along the way, the philosophy developed that *more data is always better*, yet bringing in more data amplifies the curse and the challenges associated with High Dimensions. So is more data really always better? NO!

https://docs.google.com/presentation/d/1OxbV_-gDFrHnI9Cs2lZRirw6tyzIMVlLkQP4KClfWmo/edit#slide=id.p

[^2]: Tip: Visualize your reader. You should have a reader in mind when writing; mine is often myself from a few months ago. To whom are you writing, and what question do you want answered in their mind when you are done writing? In this case, I am writing to Nate from August 2020. I want him to understand the nuances of high dimensional spaces and why caution is warranted when adding more data.

[^4]: See John Locke and other Empiricists. The idea here is that science started to become empirical (which was a good thing), but as computers and databases became prevalent, this idea of data-driven decisions began to spiral out of control. It became ingrained in our scientific culture, and specific examples were chosen to push the narrative forward. And, to cap it all off, the work of Daniel Kahneman on biases made this all the worse. In fairness, we did indeed need to fight back against our implicit biases and failing heuristics, but simply bringing in *more data* is not the whole solution.
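
A possible illustration for the complication (a rough sketch, not part of the outline itself): one well-known symptom of the curse is that distances "concentrate" as dimensions are added, so the nearest and farthest points from a query end up almost equally far away. The helper name `relative_contrast`, the uniform random data, and the point counts below are all assumptions made for illustration.

```python
import math
import random


def relative_contrast(n_points: int, n_dims: int, seed: int = 0) -> float:
    """Return (farthest - nearest) / nearest for one random query point.

    Hypothetical helper: points are drawn uniformly from the unit
    hypercube, which is an assumption and not data from the post.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same numbers
    query = [rng.random() for _ in range(n_dims)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        distances.append(math.dist(query, point))  # Euclidean distance
    nearest, farthest = min(distances), max(distances)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    # As the number of dimensions grows, the contrast between the
    # nearest and farthest neighbor shrinks toward zero.
    for dims in (2, 10, 100, 1000):
        contrast = relative_contrast(n_points=500, n_dims=dims)
        print(f"dims={dims:>4}  relative contrast={contrast:.3f}")
```

If the contrast shrinks as expected, it gives a concrete way to show the reader why adding more uncorrelated dimensions can make "nearest" less meaningful, which is the point the complication is trying to land.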