# Graph Neural Networks

Favorite Resources:
1. [Intro to graph neural networks (ML Tech Talks) - YouTube](https://www.youtube.com/watch?v=8owQBFAHw7E)
2. [Let's build GPT: from scratch, in code, spelled out. - YouTube](https://www.youtube.com/watch?v=kCc8FmEb1nY)
3. [Graph Attention Networks](https://petar-v.com/GAT/)
4. [Graph Convolutional Networks (GCNs) made simple - YouTube](https://www.youtube.com/watch?v=2KRAOZIULzw&t=201s)
    1. GCNs can be thought of as *feature smoothing* in a neighborhood
5. [Understanding Graph Attention Networks - YouTube](https://www.youtube.com/watch?v=A-yKQamf2Fc&t=1s)
6. [Michael Bronstein – Medium](https://michael-bronstein.medium.com/)
7. [Transformers are Graph Neural Networks](https://thegradient.pub/transformers-are-graph-neural-networks/)
8. [Beyond Message Passing: a Physics-Inspired Paradigm for Graph Neural Networks](https://thegradient.pub/graph-neural-networks-beyond-message-passing-and-weisfeiler-lehman/)

### Attention
In a graph neural network, the benefit over plain attention is that plain attention forces a node to consider attending to *all* other nodes, whereas the graph lets us say "only focus on the nodes you share an edge with".

Simply using the graph is great, but consider the flaw. Say nodes 1-9 all relate to a specific constraint and therefore all share edges with one another. We could have a GNN simply update the node features of all 9 nodes based on each other. But what if node 2 is only really influenced by node 7? We would like to be able to learn that! That is where graph attention comes in: we can think of it as a learned weight on each edge from a node to its neighbors, and for a given node these weights sum to 1 (a minimal sketch of this is at the end of this note).

Attending only to neighbor nodes is a form of masked attention.

A Transformer is effectively attention over a fully connected graph, so the number of edges grows quadratically with the number of nodes.

### Types of architecture
Graph neural networks take as input a graph endowed with node and edge features and compute a function that depends both on the features and the graph structure. Message-passing type GNNs, also called Message Passing Neural Networks (MPNN) [3], propagate node features by exchanging information between adjacent nodes. A typical MPNN architecture has several propagation layers, where each node is updated based on the aggregation of its neighbors' features. Aggregation functions are typically parametric and come in three different "flavors" of graph neural networks (see the sketches at the end of this note):
* **Convolutional**: Linear combination of neighbor features, where the weights depend only on the structure of the graph.
* **Attentional**: Linear combination, where the weights are computed from the node features.
* **Message Passing**: A general nonlinear function of the features of the two nodes sharing an edge.

---
Date: 20230207
Links to: 
Tags: 

References:
* []()
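
### Sketch: masked, per-edge attention (GAT-style)
A minimal NumPy sketch of the masked attention described above: a single-head, GAT-style layer. Everything here is illustrative and made up for the example (the function name `gat_layer`, the toy graph, the weight matrix `W`, and the attention vector `a`); it is not the API of any library.

```python
import numpy as np

def gat_layer(X, A, W, a):
    """One simplified GAT-style layer (single head).

    X: (N, F) node features, A: (N, N) adjacency (1 where an edge exists),
    W: (F, F') projection matrix, a: (2*F',) attention vector.
    """
    H = X @ W                                       # project node features
    N = H.shape[0]
    # raw score e_ij = LeakyReLU(a^T [h_i || h_j]) for every pair
    scores = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            scores[i, j] = np.concatenate([H[i], H[j]]) @ a
    scores = np.where(scores > 0, scores, 0.2 * scores)   # LeakyReLU
    # masked attention: a node only attends to itself and its neighbors
    mask = (A + np.eye(N)) > 0
    scores = np.where(mask, scores, -np.inf)
    # softmax over each neighborhood -> each node's edge weights sum to 1
    alpha = np.exp(scores - scores.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    # update each node as a weighted combination of its neighbors' features
    return np.maximum(alpha @ H, 0)                 # ReLU nonlinearity

# Toy usage with a made-up 4-node graph
rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
X = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 3))
a = rng.normal(size=(6,))
print(gat_layer(X, A, W, a).shape)  # (4, 3)
```

The two key lines are the mask (scores are only kept for pairs that share an edge) and the row-wise softmax, which is what makes the weights on a node's edges sum to 1.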
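### Sketch: convolutional vs. message-passing aggregation
A similarly hedged sketch of the other two aggregation flavors from the list above, under the same made-up NumPy setup (the attentional flavor corresponds to the GAT-style sketch just before this one). `convolutional_agg` and `message_passing_agg` are illustrative names, not library functions.

```python
import numpy as np

def convolutional_agg(X, A, W):
    """Convolutional flavor (GCN-style): neighbor weights depend only on the
    graph structure (symmetrically normalized adjacency with self-loops),
    which is why a GCN layer acts like feature smoothing over a neighborhood."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    norm = A_hat / np.sqrt(np.outer(d, d))    # fixed, structure-only weights
    return norm @ X @ W

def message_passing_agg(X, A, W_msg):
    """Message-passing flavor: a general nonlinear message is computed from
    the features of both endpoints of each edge, then summed at the receiver."""
    N = X.shape[0]
    out = np.zeros((N, W_msg.shape[1]))
    for i in range(N):
        for j in range(N):
            if A[i, j]:
                out[i] += np.tanh(np.concatenate([X[i], X[j]]) @ W_msg)
    return out

# Toy usage with made-up shapes
rng = np.random.default_rng(1)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
X = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 4))
W_msg = rng.normal(size=(8, 4))   # concatenated endpoint features -> message
print(convolutional_agg(X, A, W).shape)        # (3, 4)
print(message_passing_agg(X, A, W_msg).shape)  # (3, 4)
```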