In this survey, I will present a vibrant and exciting area of deep learning research: graph representation learning. Or, put simply, building machine learning models over data that lives on graphs (interconnected structures of nodes connected by edges). These models are commonly known as graph neural networks, or GNNs for short.

There is very good reason to study data on graphs. From the molecule (a graph of atoms connected by chemical bonds) all the way to the connectomic structure of the brain (a graph of neurons connected by synapses), graphs are a universal language for describing living organisms, at all levels of organisation. Similarly, most relevant artificial constructs of interest to humans, from the transportation network (a graph of intersections connected by roads) to the social network (a graph of users connected by friendship links), are best reasoned about in terms of graphs.

This potential has been realised in recent years by both scientific and industrial groups, with GNNs now being used to discover novel potent antibiotics (Stokes et al., 2020), serve estimated travel times in Google Maps (Derrow-Pinion et al., 2021), power content recommendations in Pinterest (Ying et al., 2018) and product recommendations in Amazon (Hao et al., 2020), and design the latest generation of machine learning hardware: the TPUv5 (Mirhoseini et al., 2021). Further, GNN-based systems have helped mathematicians uncover the hidden structure of mathematical objects (Davies et al., 2021), leading to new top-tier conjectures in the area of representation theory (Blundell et al., 2021). It would not be an overstatement to say that billions of people come into contact with the predictions of a GNN on a day-to-day basis. As such, it is likely a valuable pursuit to study GNNs, even without aiming to directly contribute to their development.

Beyond this, it is likely that the very cognitive processes driving our reasoning and decision-making are, in some sense, graph-structured.
That is, paraphrasing a quote from Forrester (1971): nobody really imagines in their head all the information known to them; rather, they imagine only selected concepts, and the relationships between them, and use those to represent the real system. If we subscribe to this interpretation of cognition, it is quite unlikely that we will be able to build a generally intelligent system without some component relying on graph representation learning. Note that this perspective does not clash with the fact that many recent skillful ML systems are based on the Transformer architecture (Vaswani et al., 2017); as we will uncover in this review, Transformers are themselves a special case of GNNs.
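The claim that Transformers are a special case of GNNs can already be made concrete here: single-head self-attention is exactly attention-weighted message passing on a fully connected graph. The following NumPy sketch is purely illustrative (the function names, shapes, and the complete-graph adjacency are assumptions for the sake of the example, not code from any of the systems cited above):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(X, Wq, Wk, Wv):
    """Standard single-head self-attention over node features X (n x d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # n x n attention weights
    return A @ V

def message_passing(X, Wq, Wk, Wv, adj):
    """The same computation phrased as a GNN layer: each node aggregates
    "messages" from its neighbours (given by adjacency matrix adj), with
    attention-derived edge weights. On a complete graph (adj all ones),
    this coincides with self-attention."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    scores = np.where(adj > 0, scores, -np.inf)  # mask out non-edges
    A = softmax(scores)
    return A @ V

rng = np.random.default_rng(0)
n, d = 4, 8
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
complete = np.ones((n, n))  # fully connected graph: every node attends to all
assert np.allclose(attention(X, Wq, Wk, Wv),
                   message_passing(X, Wq, Wk, Wv, complete))
```

Restricting `adj` to a sparser graph recovers a graph attention layer in the style of GAT; the fully connected case is what makes the Transformer a (dense) instance of this family, as the later sections of the review spell out properly.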