# Mutual Information Papers and Ideas

### My Thoughts

I believe the following:

* A user has a certain set of expectations about how they see the world.
* They use these expectations to navigate the world. These expectations are crucial to the little predictions they make as they attempt to do their job.
* They may have an expectation about how the world looks under some given context, _and that expectation is incorrect_, and there is an inherent _value_ associated with the mistake.
* We want to _help_ them identify this incorrect expectation so they can error-correct and increase value.
* So, fundamentally, it seems necessary to encode/capture our end users' _expectations_ about the world in our AI.
* At this point in time, the way we do that is via a simple _assumption_: the distribution of the KPI for a given pattern (a combination of filters) will look like that of its parent.
* This leaves an _immense_ amount of expectation-related information on the table.

So, we can define the problem and objective as follows:

> **Problem**: We need a less naive way to encode/capture users' *expectations* about the world. We then need a way to see, objectively from the data, the reality that is actually occurring.

One thing I would like to demonstrate would be a few key, quick tasks that could be interesting:

1. Simpson's paradox finder
2. Expectation-reversal finder
3. Natural-experiment finder
4. Lever analyzer
5. Create an action data structure to capture whether a sibling could realistically be moved to be more like its siblings
	1. This is a key idea and thing to consider.
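Task 1 above could be sketched quickly. The following is a minimal, hypothetical implementation (function and variable names are my own, not from any of the papers): flag a binary-treatment, binary-outcome table where the aggregate association disagrees in sign with every subgroup.

```python
# Hypothetical sketch of a "Simpson's paradox finder": detect when the
# aggregate comparison of two treatments reverses inside every subgroup.

def rate(successes, total):
    return successes / total if total else 0.0

def simpsons_reversal(groups):
    """groups: dict of subgroup -> {"A": (successes, total), "B": (successes, total)}.
    Returns True when the overall A-vs-B difference has the opposite
    sign of the A-vs-B difference within every subgroup."""
    # Aggregate counts across subgroups.
    agg = {}
    for g in groups.values():
        for t, (s, n) in g.items():
            cs, cn = agg.get(t, (0, 0))
            agg[t] = (cs + s, cn + n)
    overall = rate(*agg["A"]) - rate(*agg["B"])
    per_group = [rate(*g["A"]) - rate(*g["B"]) for g in groups.values()]
    # Paradox: the aggregate sign disagrees with every subgroup's sign.
    return all(overall * d < 0 for d in per_group)

# Classic kidney-stone data: A wins within both subgroups, loses overall.
stones = {
    "small": {"A": (81, 87),   "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}
print(simpsons_reversal(stones))  # True
```

In practice the patterns (filter combinations) we already enumerate would play the role of the subgroups, so this check could ride on top of the existing pattern lattice.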
This information is captured *all the time, informally, in conversation with customers.*

### Contributions of these papers

**Paper 1**

* Fantastic overview of the redundancy and synergy present in attributes of data
* Gives a concrete way in which we can calculate redundancy
* Lattice diagrams are nice visualizations
* Key terms: *synergy*, *redundancy*

**Paper 2**

* Explains the shortcomings of the approach of paper 1 (specifically $I_{min}$)
* Provides alternative measures (see [here](https://dit.readthedocs.io/en/stable/measures/pid.html) for the associated python library)

**Paper 3**

* Provides a method to visualize the interaction of attributes; however, it is based on interaction information, which has proven non-intuitive to deal with because it can be negative
* Key terms: *positive interaction* (synergy), *negative interaction* (redundancy)

**Paper 4**

### Challenges that I am seeing arise

* How do you computationally (in a reasonable amount of time) create a network of expectations and interactions as the number of attributes increases?
* We probably want to find an attribute that is either informative 80% of the time or uninformative 80% of the time (the remaining 20% then leaves room for surprise or a reversal of expectation).

---
Date: 20210614
Links to:
References:
1. Nonnegative Decomposition of Multivariate Information (paper, see PDF Expert)
2. Information Decomposition of Target Effects from Multi-Source Interactions (paper)
3. Quantifying and Visualizing Attribute Interactions: An Approach Based on Entropy (paper)
4. A Rigorous Information Theoretic Definition of Redundancy and Relevancy in Feature Selection Based on Partial Information Decomposition (paper)
5. Notability - Interaction information idea
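A quick sketch of the interaction information that Paper 3 builds on, to make its sign behavior concrete. This is my own minimal implementation over sample lists (not code from the papers), using one common sign convention, $I(X;Y;Z) = I(X;Y\mid Z) - I(X;Y)$; note that sign conventions vary across the literature.

```python
# Sketch: interaction information I(X;Y;Z) = I(X;Y|Z) - I(X;Y),
# positive for a synergistic relation (XOR) and negative for a
# redundant one (three copies of the same bit).

from collections import Counter
from math import log2

def entropy(samples):
    """Empirical Shannon entropy (bits) of a list of hashable samples."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

def mutual_info(xs, ys):
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def cond_mutual_info(xs, ys, zs):
    # I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)
    return (entropy(list(zip(xs, zs))) + entropy(list(zip(ys, zs)))
            - entropy(list(zip(xs, ys, zs))) - entropy(zs))

def interaction_info(xs, ys, zs):
    return cond_mutual_info(xs, ys, zs) - mutual_info(xs, ys)

# XOR: Z = X ^ Y over uniform inputs -> synergy (+1 bit).
xs, ys = [0, 0, 1, 1], [0, 1, 0, 1]
zs = [x ^ y for x, y in zip(xs, ys)]
print(interaction_info(xs, ys, zs))   # 1.0

# Copy: X = Y = Z -> redundancy (-1 bit).
b = [0, 0, 1, 1]
print(interaction_info(b, b, b))      # -1.0
```

The fact that a single number can come out positive (synergy) or negative (redundancy) is exactly the non-intuitive behavior noted under Paper 3, and is what the PID measures in Papers 1, 2, and 4 decompose into separate nonnegative parts.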