Efficient Training of Graph Neural Networks on Large Graphs

Authors:

Yanyan Shen, Lei Chen, Jingzhi Fang, Xin Zhang, Shihong Gao, Hongbo Yin

Abstract

Graph Neural Networks (GNNs) have gained significant popularity for learning representations of graph-structured data. Mainstream GNNs employ the message passing scheme that iteratively propagates information between connected nodes through edges. However, this scheme incurs high training costs, hindering the applicability of GNNs on large graphs. Recently, the database community has extensively researched effective solutions to facilitate efficient GNN training on massive graphs. In this tutorial, we provide a comprehensive overview of the GNN training process based on the graph data lifecycle, covering graph preprocessing, batch generation, data transfer, and model training stages. We discuss recent data management efforts aiming at accelerating individual stages or improving the overall training efficiency. Recognizing the distinct training issues associated with static and dynamic graphs, we first focus on efficient GNN training on static graphs, followed by an exploration of training GNNs on dynamic graphs. Finally, we suggest some potential research directions in this area. We believe this tutorial is valuable for researchers and practitioners to understand the bottleneck of GNN training and the advanced data management techniques to accelerate the training of different GNNs on massive graphs in diverse hardware settings.

PVLDB is part of the VLDB Endowment Inc.

Start

Current Submission

All Volumes

Reproducibility

General Information

Volume 17, No. 12

Efficient Training of Graph Neural Networks on Large Graphs

Abstract