Revisiting Graph Neural Networks for Link Prediction

Conference on Neural Information Processing Systems (NeurIPS)


In this paper, we theoretically characterize graph neural network’s representation power for high-order node set prediction problems (where a prediction is made over a set of more than 1 node). In particular, we focus on one most important second-order task—link prediction. There are two representative classes of GNN methods for link prediction: GAE and SEAL. GAE (Graph Autoencoder) first applies a GNN to the whole graph, and then aggregates the representations of the source and target nodes as their link representation. SEAL extracts a subgraph around the source and target nodes, labels the nodes in the subgraph, and then uses a GNN to learn a link representation from the labeled subgraph. At first glance, both GAE and SEAL use a GNN. However, their performance gap can be very large. On the recent Open Graph Benchmark datasets, SEAL achieved 3 first places out of 4 datasets, outperforming the best GAE method by up to 195% in Hits@100. In this paper, by studying this performance gap between GAE and SEAL, we first point out a key limitation of GAE caused by directly aggregating two node representations as a link representation. To address this limitation, we propose the labeling trick. Labeling trick unifies several recent successes to improve GNNs’ representation power, such as SEAL, Distance Encoding, and Identity-aware GNN, into a single and most basic form. We prove that with labeling trick a sufficiently expressive GNN can learn the most expressive structural representations for node sets. Our work establishes a theoretical foundation for using GNNs for high-order node set prediction.

Featured Publications