# Reinforcement Learning
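Reinforcement learning is mode-seeking rather than distribution-matching: it concentrates probability on high-reward behavior instead of reproducing the full data distribution ([listen here](https://overcast.fm/+aYlPxjgs0/1:21:55)).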
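
One common way to make the mode-seeking claim concrete is through the direction of the KL divergence. This framing is an assumption here, and the linked episode may put it differently. The symbols are hypothetical: $p$ is a target distribution and $\pi_\theta$ is the learned policy.

$$
\text{distribution matching:}\quad \min_\theta \mathrm{KL}(p \,\|\, \pi_\theta) = \min_\theta \mathbb{E}_{x \sim p}\left[\log \frac{p(x)}{\pi_\theta(x)}\right]
$$

$$
\text{mode seeking:}\quad \min_\theta \mathrm{KL}(\pi_\theta \,\|\, p) = \min_\theta \mathbb{E}_{x \sim \pi_\theta}\left[\log \frac{\pi_\theta(x)}{p(x)}\right]
$$

The first objective penalizes $\pi_\theta$ for assigning low probability anywhere $p$ has mass, so it spreads out to cover the data. The second penalizes $\pi_\theta$ only where it places mass itself, so it can settle on a single high-probability mode. KL-regularized reward maximization, $\max_\theta \mathbb{E}_{x \sim \pi_\theta}[r(x)] - \beta\,\mathrm{KL}(\pi_\theta \,\|\, \pi_0)$, has the second form with $p(x) \propto \pi_0(x)\exp(r(x)/\beta)$.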
### Quick Intuition
* A **policy** is just a probability distribution over **actions**, conditioned on the current state, that our agent samples from to select its next action
* To steer the policy toward the behavior we want, we can use **reward shaping**: adding intermediate rewards on top of the environment's own reward. Note that reward shaping can be [very challenging](https://youtu.be/JgvyzIkgxF0?t=742).
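
The two ideas above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the softmax policy over raw preference scores, the 1-D position state, and the names `softmax`, `sample_action`, and `shaped_reward` are all hypothetical. The shaping term uses potential-based shaping (a specific technique chosen for this example), with the potential set to the negative distance to a goal.

```python
import math
import random


def softmax(logits):
    """Turn arbitrary preference scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(probs, rng):
    """Select an action index by sampling from the policy's distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def shaped_reward(env_reward, pos, next_pos, goal, gamma=0.99):
    """Add a potential-based shaping bonus to the environment reward.

    Assumption: the state is a 1-D position and the potential is the
    negative distance to the goal, so moving closer earns a bonus.
    """
    def potential(p):
        return -abs(goal - p)

    return env_reward + gamma * potential(next_pos) - potential(pos)


rng = random.Random(0)
probs = softmax([2.0, 1.0, 0.1])  # hypothetical scores for 3 actions
action = sample_action(probs, rng)

toward = shaped_reward(0.0, pos=2, next_pos=3, goal=5)  # step toward goal
away = shaped_reward(0.0, pos=2, next_pos=1, goal=5)    # step away from goal
```

Even when the environment reward is zero (a sparse-reward setting), the step toward the goal gets a positive shaped reward and the step away gets a negative one. A badly chosen shaping term can just as easily reward the wrong behavior, which is part of why reward shaping is hard to get right.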
### Good Resources
* Learning with sparse rewards: [Reinforcement Learning with sparse rewards (YouTube)](https://www.youtube.com/watch?v=0Ey02HT_1Ho)
---
Date: 20231217
Links to:
Tags:
References:
* [Deep Reinforcement Learning Doesn't Work Yet](https://www.alexirpan.com/2018/02/14/rl-hard.html)