# Unsupervised and Dimensionality Reduction
* We need the right pieces of information to make optimal decisions
* Realistically, there are an infinite number of dimensions that we could consider
* However, simply increasing the number of dimensions is not going to get us anywhere. We become overwhelmed by choices, the interaction effects become unwieldy, and we lose our signal in the noise. See the [horse racing study](Prediction%20Quality%20vs%20Amount%20of%20Information%20in%20Horse%20Racing.md) as a prime example of this.
* So, what is a standard way of dealing with a large number of dimensions as it relates to making decisions?
* Approach 1: Use heuristics that tend to be applied in your particular area to select a subset of the variables to consider. This is very prone to bias.
* Approach 2: Use machine learning dimensionality reduction techniques to reduce the number of dimensions to something more manageable, where each resulting dimension carries a larger share of the information content than any one of the original dimensions
* Note: the reason we can't simply handle a massive number of dimensions is the curse of dimensionality. Due to concentration of measure, our distance and similarity metrics become less useful as the number of dimensions grows, so we must reduce our dimensions somehow (see the distance-concentration sketch below).
* Both of these approaches have drawbacks. The main issue with the second approach is that the resulting variables are not interpretable or understandable by humans. They are, in a sense, transformations of the original variables, but they don't resonate with the human mind (the PCA sketch below illustrates this).
* Enter unsupervised and fabric.
* By using encodings and semantics we can consider *all* human understandable features that could be created in a data set (i.e. that can be created via transformations of the original features), and we can use semantics to help pare down that massive list to a more manageable list (a rough stand-in for this paring step is sketched below)
* Then, based on this resulting feature set, we can use PF and DR to optimize the resulting patterns, where we find those that have nuance with respect to the KPI
* So, unsupervised is a human understandable form of dimensionality reduction
* Once we have a reduced dimensionality we can more appropriately use distance functions
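
Below is a minimal sketch (my own illustration, not from the sources above) of why distance functions degrade in high-dimensional spaces: as the number of dimensions grows, the nearest and farthest points end up at nearly the same distance, which is why reducing dimensionality first makes distance comparisons meaningful again.

```python
# Sketch: distance concentration as dimensionality grows
import numpy as np

rng = np.random.default_rng(0)

for d in (2, 10, 100, 1000):
    # Sample points uniformly in the d-dimensional unit hypercube
    points = rng.random((500, d))
    query = rng.random(d)

    # Euclidean distance from the query point to every sampled point
    dists = np.linalg.norm(points - query, axis=1)

    # Relative contrast: how much farther the farthest point is than the nearest.
    # As d grows this ratio shrinks toward 1, so "near" and "far" lose meaning.
    contrast = dists.max() / dists.min()
    print(f"d={d:5d}  relative contrast (max/min distance) = {contrast:.2f}")
```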
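And a minimal example of Approach 2 (assuming scikit-learn is available; PCA is just a stand-in for whichever dimensionality reduction technique is used), showing why the reduced dimensions are hard to interpret: each derived dimension is a weighted mix of *all* the original features.

```python
# Sketch: PCA compresses the data but produces non-human-meaningful dimensions
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Toy data with hidden low-dimensional structure: 200 rows, 8 original features
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(200, 8))
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)          # 200 rows, 2 derived dimensions

print("variance explained:", pca.explained_variance_ratio_)
# Each component is a linear combination of every original feature,
# e.g. "+0.31*feature_0 -0.12*feature_1 + ..." -- compact, but not interpretable.
for i, weights in enumerate(pca.components_):
    mix = " + ".join(f"{w:+.2f}*{name}" for name, w in zip(feature_names, weights))
    print(f"component_{i} = {mix}")
```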
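Finally, a rough stand-in for the paring-down step (this is not the actual encodings/semantics or PF machinery, just a generic illustration): enumerate named, human-readable transformations of the original columns, then rank them by how much each tells us about the KPI. Mutual information is used here purely as an illustrative relevance score.

```python
# Sketch: generate human-readable candidate features, then pare by KPI relevance
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "price": rng.uniform(10, 100, 300),
    "units": rng.integers(1, 50, 300),
})
kpi = df["price"] * df["units"] + rng.normal(0, 50, 300)   # toy KPI: noisy revenue

# Candidate features: each is a named, human-readable transformation
candidates = {
    "price": df["price"],
    "units": df["units"],
    "log(price)": np.log(df["price"]),
    "price * units": df["price"] * df["units"],
    "price / units": df["price"] / df["units"],
}

X = np.column_stack(list(candidates.values()))
scores = mutual_info_regression(X, kpi, random_state=0)

# Keep the most KPI-relevant candidates -- still phrased in the original terms
for name, score in sorted(zip(candidates, scores), key=lambda t: -t[1]):
    print(f"{name:15s} relevance score = {score:.2f}")
```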
* QUESTIONS:
* harden how pattern find fits in here?
* how does it exploit a low dimensional space?
* how does having the label (KPI) play a role here?
* **Why would pattern find not work with a much larger dimensional space?** Is it simply computational resources? Or is there a fundamental flaw (see Euclidean distance not working)?