Understand Systems via What They Are Not Doing

# Understand Systems via What They Are Not Doing

Understanding what a complex system is doing is hard. How do you probe it in order to build an explanation of how it works? This problem arises both in the natural sciences and in the study of artificial systems (see the papers [Linear Mode Connectivity and the Lottery Ticket Hypothesis](Linear%20Mode%20Connectivity%20and%20the%20Lottery%20Ticket%20Hypothesis.pdf) and [Understanding Transformers via N-gram Statistics](Understanding%20Transformers%20via%20N-gram%20Statistics.pdf)).

One approach is to use a **simple system** that you fully understand as a reference point for the more complex system. This can work in two ways:

1. Show that the complex system is doing something *similar* to the simple system (e.g. [Understanding Transformers via N-gram Statistics](Understanding%20Transformers%20via%20N-gram%20Statistics.pdf))
2. Show that the complex system is doing something *in addition to* the simple system (e.g. [Cover](Cover.md))

Notice that in (2) we don't actually identify *how* or *why* the complex system is doing the additional thing on top of the simple system. That isn't the point of (2). We just want to identify *whether it is doing something extra at all*. This is a binary question: yes or no.

This drastically simplifies our analysis, since comparing two systems is generally a much easier task than coming up with a fully fledged explanation of a complex system. Imagine that you spent a month trying to explain what the complex system was doing, only to then see that it performs the same as the [Simple Baselines](Simple%20Baselines.md). In other words, the simple baseline could have been your explanation all along.

This approach allows for iterative learning. It forces you to isolate the attributes of a system that you are most interested in understanding. For instance, say I have a complex system $C$.
I can likely think of $C$ as some combination of simpler systems, $S_i$, and something else that is unique to $C$, which I'll just call $C^*$:

$C = S_1 + \dots + S_n + C^*$

Thinking of $C$ in these terms forces us to ask which simple systems $S_i$ compose $C$, and exactly which part we wish to isolate so that we can understand its benefit, namely $C^*$. This is walked through in detail in [Cover](Cover.md).

---

Date: 20240819
Links to: [Cover](Cover.md)
Tags:
References:
* []()