All projects Graph Neural Networks

Graph Attention Networks

Learning how much each neighbor matters.

Study & implementation 2023 AttentionMessage passingRelational learning

Motivation

Graph Attention Networks replace fixed aggregation weights with learned attention over a node's neighborhood, letting the model decide which relationships carry information — without needing the full graph structure up front.

Method

For each edge an attention coefficient is computed from the endpoints' features, normalized over the neighborhood, and used to weight the message. Multi-head attention stabilizes learning and captures diverse relational patterns.

\[ \alpha_{ij} = \frac{\exp\!\big(\mathrm{LeakyReLU}(a^\top[Wh_i \,\|\, Wh_j])\big)}{\sum_{k\in\mathcal{N}(i)} \exp\!\big(\mathrm{LeakyReLU}(a^\top[Wh_i \,\|\, Wh_k])\big)} \]

Mathematical core

GAT is self-attention restricted to a graph's edges — the same operator behind transformers, specialized to non-Euclidean, relational domains.

Veličković et al. (2017)

← PreviousSummarization with T5 Next →Graph Convolutional Networks