# Science at Startups

What does it mean to do good science *in a startup environment*? We almost certainly are going to be working on [High Surface Area Problems](High%20Surface%20Area%20Problems.md). We can start by defining what it means to do good science in general. Then we can layer on the constraints of working in a startup environment. We can finish with a concrete example of a problem I am struggling with at ZGE today.

## 1. [Doing Good Science](Doing%20Good%20Science.md)

* **Hypotheses** should be made **explicit** and **falsifiable**
* **[Measurement Focused](Measuring%20the%20Right%20Thing.md)**: *What* are we *measuring*? *How* are we going to measure it?
* Ask **two-sided questions** where no matter what the result is, you learn something about the world.
* Distinguish between **existential** and **universal** evidence

## 2. Startup Constraints

* Constantly consider resource allocation and [Opportunity Cost](Opportunity%20Cost.md)
* [Understanding is not always the deliverable](Understanding%20is%20the%20Deliverable.md). Sometimes we just need results and to stay alive.
* We generally need to achieve [Step Function Improvements](Step%20Function%20Improvements.md) *if we are going to build out a complex system that takes an action*.
* Systems that perform a *measurement*, such as an experimental harness, can be bolstered via [Stacking S Curves](Stacking%20S%20Curves.md)
* [We want to move fast, iterate, try new things](Elon%20Musk.md) and have [Velocity over Predictability](Velocity%20over%20Predictability.md)

## 3. Doing Good Science at Startups

So how can we ensure that we are doing good science at startups?

**Hypotheses**

* Hypotheses should be made explicit by writing them down and getting alignment from others.
* Hypotheses should be made falsifiable by formally defining the conditions under which you would deem them falsified (this may be a practical limit)

**Iteration Speed and a Good Compass**

* To guarantee [fast iteration](Velocity%20over%20Predictability.md), you should balance seeking existential and universal evidence
* Existential evidence should be gathered *as fast as possible*. It should be *time boxed*. If you can't achieve it quickly, ask why; if you can build tooling to make this faster in the future, you should do so. Existential evidence can be used to prioritize where to spend more time and resources.
* Universal evidence will likely take more time and care to gather. Only do so when absolutely necessary.
* Ask two-sided questions where, no matter the result, you will have learned something useful about your problem that can provide direction for what to do next

**Measurement**

* Explicitly write down what you are measuring, how you are going to measure it, and how it relates to what you really care about
* Measurement should be improved when possible via [Stacking S Curves](Stacking%20S%20Curves.md). This simply means adding an additional metric to your experimental setup when the need arises, making the setup better over time.

**Things to Avoid**

* Gathering existential evidence for extended periods without thinking about what it will take to move toward universal evidence.
* Running a string of non-falsifiable experiments where you don't learn much. For example: when I tried graph neural networks or matrix profile, I wasn't able to conclusively rule them out. If you try too many of these things for too long, you may end up with nothing to show for it.

## 4. [An Example: Anomalies at ZGE](Anomalies%20at%20ZGE.md)

## 5. [Backtest Uncertainty](Backtest%20Uncertainty.md)

---
Date: 20231105
Links to:
Tags:

References:

* []()