# Learning in Data Science (general)
In general, whenever we are trying to learn something, we are dealing with four key components (a concrete mapping of these components follows this list):
1. **Learning Algorithm** (ex. back propagation)
2. **Parameters** (ex. weights in a neural network)
3. **Model** (Bayes Net, Markov Network, HMM, Markov Chain, Linear regression, Neural net)
4. **Objective Function** (Mean squared error, Maximum likelihood)
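To make these four components concrete, here is a minimal sketch (an illustrative example of mine, not taken from the reference below) that labels each component for ordinary least-squares linear regression:

```python
import numpy as np

# 3. Model: a linear map from inputs to predictions, y_hat = X @ w + b
def model(X, w, b):
    return X @ w + b

# 4. Objective function: mean squared error between predictions and targets
def mse(y_hat, y):
    return np.mean((y_hat - y) ** 2)

# 2. Parameters: the weights w and bias b are what learning updates
# 1. Learning algorithm: gradient descent on the MSE (a full training loop
#    is sketched after the process description below)
```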
We can think of the learning process as follows:
1. We have some objective phenomena that exist in the real world. These phenomena are our underlying reality.
2. We have data/observations that (hopefully) correspond to this reality, capturing certain patterns and regularities. These observations *describe* a portion of our reality.
3. We wish to **learn** a **model** that captures certain properties and phenomena of this reality. In physics this may be a differential equation that effectively captures a law of motion, such as the flight of a ball through the air. In data science this may be a neural network that effectively learns the underlying relationship between the arrangement and intensities of pixels in an image and what the image represents (such as a dog vs. a cat). In unsupervised learning, this may mean learning a high-dimensional joint distribution that captures the probability of certain observations arising from the underlying reality.
4. Generally, the data scientist must select a **model** that they believe can capture the underlying regularities and patterns in the data, which in turn ideally reflect the patterns existing in reality. Different models come with different assumptions, constraints, efficiencies, and so on.
5. A model will always have **parameters**. These parameters are what get updated during the learning process. If we relate this to a human brain, we can think of the brain as consisting of synaptic connections with varying potentials. As a human learns, these potentials are strengthened in certain areas, new connections are made, and so on.
6. In order to learn our model parameters we need two things: an **objective function** and a **learning algorithm**.
7. The **objective function** maps our parameterized model (together with the data) to a single value. We then try to minimize this value (squared error, cross entropy) or maximize it (log likelihood).
8. In order to optimize this value, we need a **learning algorithm** (such as backpropagation or expectation maximization). These algorithms need some way of *moving* the parameters. This is generally done via calculus (following gradients in gradient descent) or iterative procedures (such as assigning points to clusters and then recalculating the cluster centers, as in k-means). A worked example follows this list.
9. At the end of this process we will ideally have a model that effectively represents the reality we are interested in capturing.
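Putting steps 5 through 8 together, here is a minimal end-to-end sketch: a linear model with parameters `w` and `b`, mean squared error as the objective function, and plain gradient descent as the learning algorithm. The synthetic data, learning rate, and step count are illustrative assumptions, not something prescribed by the source.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "reality": y depends linearly on x, observed with noise
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 3.0 + 0.1 * rng.normal(size=200)

# Parameters: what the learning process updates
w = np.zeros(3)
b = 0.0

lr = 0.1  # step size for gradient descent (illustrative choice)
for step in range(500):
    y_hat = X @ w + b                   # model: parameterized mapping
    error = y_hat - y
    loss = np.mean(error ** 2)          # objective function: MSE
    # Learning algorithm: gradient descent moves the parameters downhill
    grad_w = 2 * X.T @ error / len(y)
    grad_b = 2 * np.mean(error)
    w -= lr * grad_w
    b -= lr * grad_b

print(loss, w, b)
```

After training, `w` should land close to `true_w` and the loss near the noise floor, which is the sense in which the learned model represents the reality that generated the data.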
---
Date: 20210706
Links to: [003-Data-Science-MOC](003-Data-Science-MOC.md)
Tags: #review
References:
* [Lecture 19 - RBMs](https://www.cse.iitm.ac.in/~miteshk/CS7015/Slides/Handout/Lecture19.pdf)