# Information

### Adami Article ([What is Information](What%20is%20Information.pdf))

**Information**
That which allows you (who is in possession of that information) to make predictions with better accuracy than chance (from this perspective information is equivalent to *knowledge*, fundamentally tied to *prediction*).

**Entropy**
Is used to quantify how much isn't known. It is "potential information". It quantifies how much you could *possibly* know, but it isn't what you *actually* know. *Uncertainty* is used interchangeably with entropy.

**Information**[^1]
The difference between entropies (e.g. before and after measurement). Information is real. Entropy is in the eye of the beholder (TODO: make sure you follow this - think about DD criteria for reality here). Information and entropy are two fundamentally different objects.

**Information theory is a theory of the relative state of measurement devices.**

Physical systems[^2] do not possess an intrinsic entropy. Entropy is a property that is mathematically defined for a [Random Variable](Random%20Variable.md). But physical systems aren't mathematical, they are messy, complicated things. They become mathematical when *observed* through a *measurement device* that has a *finite resolution*. Their entropy is defined by the measurement device used to examine them. A measurement device could be almost anything:

* Your eyes when looking at a six-sided die (six resolvable states)
* A microscope looking at a bacterial sample (~1 million resolvable states)
* A mercury thermometer (~200 resolvable states)

But the general structure will always be as follows. We start with a physical system of interest. Then we choose some way to observe it and measure which state it is in. The measurement device implicitly defines the states that the system can be in. If a different measurement device had been chosen, the system would have had a different set of resolvable states.

Who chooses the measuring device? *You do*. You determine the *states* of a physical object that you wish to measure, and choose a measurement device that can resolve those states. Entropy is defined by the number of states that can be resolved by the measurement you are going to use to determine the state of the physical object. From this it is clear that entropy is an *anthropomorphic concept*, not only in the well-known statistical sense that it measures the extent of human ignorance as to the microstate. Even at the purely phenomenological level, entropy is an anthropomorphic concept. For it is a property, not of the physical system, but of the particular experiments you or I choose to perform on it.

**Information and entropy depend on implicit context**

When asked to calculate the entropy of a mathematical random variable (as opposed to a physical object), you are generally given a bunch of information that you didn't realize you had, such as the number of possible states to expect, what those states are, and possibly even the likelihood of each of those states. Given those, your prime objective is to predict the state of the random variable as accurately as you can, and the more information you have, the better your prediction will be. Entropy quantifies how much you don't know about something (a random variable). But in order to quantify how much you don't know, you have to know something about the thing you don't know. These are the hidden assumptions in probability theory and information theory. These are the things you didn't know you knew.
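A minimal numerical sketch of this bookkeeping (not from the article; the loaded-die probabilities below are invented for illustration): the measurement device (your eyes) fixes six resolvable states for a die, so the maximum entropy is $\log_2 6$; any extra knowledge that skews the distribution lowers the remaining entropy, and the difference is the information gained.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A six-sided die seen through "eyes" as the measurement device:
# six resolvable states, so the maximum (prior) entropy is log2(6).
n_states = 6
h_max = math.log2(n_states)        # ~2.585 bits: what you could possibly know

# Hypothetical side-information: suppose we learn the die is loaded toward 6.
loaded = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]
h_after = entropy(loaded)          # what remains to be known, given what we know

information = h_max - h_after      # information = difference of entropies
print(f"H_max = {h_max:.3f} bits, H = {h_after:.3f} bits, I = {information:.3f} bits")
```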
(TODO: shorten this section)

**All probability distributions that aren't uniform are conditional probability distributions**

Probability distributions are born uniform. In that case, you know nothing about the variable, except perhaps the number of states it can take on. Because if you didn't know that, then you wouldn't even know how much you don't know. That would be the "unknown unknowns" that a certain political figure once injected into the national discourse. These probability distributions become non-uniform (that is, some states are more likely than others) once you acquire information about the states. This information is manifested by conditional probabilities. You really only know that a state is more or less likely than the random expectation if you at the same time know something else (like in the case discussed in the article, whether the driver is texting or not). Put another way, any probability distribution that is not uniform (the same probability for all states) is necessarily conditional. When someone hands you such a probability distribution, you may not know what it is conditional on, but it is conditional. Shannon's entropy is written in terms of "p log p", but these "p" are really conditional probabilities if you know that they are not uniform (that is, not the same p for all states). They are non-uniform given what else you know.

* We can then define information as: $\text{Information} = \overbrace{\text{What you don't know}}^{H_{max}} - \overbrace{\text{What remains to be known given what you know}}^{H}$
* Information is "unconditional entropy minus conditional entropy". When cast as a relationship between two random variables $X$ and $Y$, this can be written conditionally as: $I(X : Y) = H(X) - H(X \mid Y)$
* We can reason about the above equation as follows. Entropy is "potential information". It quantifies "how much you could *possibly* know", but it is not what you *actually* know. We can think of $H(X)$ as "how much could be known about $X$", and $H(X \mid Y)$ as "how much could be known about $X$, given we know what $Y$ is". Now imagine $Y$ provides no information about $X$ - then we would have $H(X) = H(X \mid Y)$, and hence $I(X : Y) = 0$. On the other hand, imagine that $Y$ provides all that you need to know about $X$, hence $H(X \mid Y) = 0$, meaning that if you know $Y$ the entropy of $X$ (what there is to know about $X$) is $0$. We see then that the information between $X$ and $Y$ is $I(X : Y) = H(X)$. But this doesn't mean that information always equals entropy! Just that, in this scenario, the potential information of $X$, namely the entropy $H(X)$, is entirely known given $Y$. (A small numerical sketch of these two limiting cases appears below, after the Beginning of Infinity notes.)
* Leaving off pg 15/16

### The Beginning of Infinity

* Human brains and DNA molecules each have many functions, but among other things they are general-purpose information-storage media: they are in principle capable of storing any kind of information. Moreover, the two types of information that they respectively evolved to store have a property of cosmic significance in common: once they are physically embodied in a suitable environment, they tend to cause themselves to remain so. Such information – which I call knowledge – is very unlikely to come into existence other than through the error-correcting processes of evolution or thought.
* Knowledge is information which, when it is physically embodied in a suitable environment, tends to cause itself to remain so.
* Pg 142, 186
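Returning to the identity $I(X : Y) = H(X) - H(X \mid Y)$ from the Adami notes above, here is a minimal sketch (the two joint distributions are invented for illustration, not taken from any of the sources) that checks the two limiting cases: independent $X$ and $Y$ give $I = 0$, while a $Y$ that fully determines $X$ gives $I = H(X)$.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X:Y) = H(X) - H(X|Y), with joint[x][y] = P(X=x, Y=y)."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    h_x = entropy(p_x)
    # H(X|Y) = sum over y of P(y) * H(X | Y=y)
    h_x_given_y = 0.0
    for j, py in enumerate(p_y):
        if py > 0:
            cond = [joint[i][j] / py for i in range(len(joint))]
            h_x_given_y += py * entropy(cond)
    return h_x - h_x_given_y

# Y tells us nothing about X (independent): H(X|Y) = H(X), so I(X:Y) = 0
independent = [[0.25, 0.25],
               [0.25, 0.25]]

# Y determines X completely: H(X|Y) = 0, so I(X:Y) = H(X)
determined = [[0.5, 0.0],
              [0.0, 0.5]]

print(mutual_information(independent))  # -> 0.0
print(mutual_information(determined))   # -> 1.0 (= H(X) for a fair binary X)
```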
### From The Fabric of Reality

* The complexity of a piece of information is defined in terms of the computational resources (such as the length of the program, the number of computational steps, or the amount of memory) that a computer would need if it were to reproduce that piece of information.
* In the twentieth century, information was added to this list when the invention of computers allowed complex information processing to be performed outside human brains.
* But with the benefit of hindsight we can now see that even the classical theory of computation did not fully conform to classical physics, and contained strong adumbrations of quantum theory. It is no coincidence that the word *bit*, meaning the smallest possible amount of information that a computer can manipulate, means essentially the same as *quantum*, a discrete chunk.
* This information is located in the physical state of my brain.
* Information can only be processed in one way: by [Computation](Computation.md) of the kind invented by Babbage and Turing.

### [Constructor Theory of Information](Constructor%20Theory%20of%20Information.pdf)

TODO: Go through constructor theory of information

### Notes and thoughts

An information transfer process would be hard to define because the information of any physical entity is only objective relative to how you measure it, which makes this a huge space.

---
Date: 20240807
Links to:
Tags:
References:

* []()

[^1]: Information (or lack thereof) is not defined in terms of the number of unknown states. Rather, it is the logarithm of the number of unknown states. This ensures additivity.
[^2]: A physical system could be arbitrarily complex, but at its core it consists of objects. A physical system could be one die, five dice, two dice and a deck of playing cards, the dashboard of a car, the car itself, and so on.