Volume 17, No. 12
Complex-Path: Effective and Efficient Node Ranking with Paths in Billion-Scale Heterogeneous Graphs
Abstract
Node ranking in heterogeneous graphs, which quantifies the relative importance of nodes, can often be improved by incorporating information from relevant paths. Graph databases and heterogeneous graph neural networks (HGNNs) are the two main approaches to this problem. Graph databases support efficient path queries over flexible path types but require manual design to combine the query results for node ranking. Conversely, current HGNNs can automatically integrate semantic information from multiple linear path types for accurate node ranking. However, our experiments show that they fail to outperform a multi-layer perceptron model that uses features extracted from multiple nonlinear conditional paths, which graph databases can handle. We therefore aim to enable HGNNs to take advantage of these path types for better performance. Doing so raises two problems: HGNNs require a generalized path schema to define the structure of input paths, and each additional path type significantly increases the system memory and sampling time that HGNNs require. To address these limitations, we introduce CompNode, a novel framework based on a new unified path schema definition called Complex-path, which describes all the required path types, including nonlinear conditional path types. We then design a pre-aggregation method that reduces the required system memory and sampling time by pre-aggregating complex-paths of the same type. Furthermore, we develop a model that combines semantic information from all aggregated complex-paths for accurate node ranking. Real-world experiments on identifying top potential high-value customers show that CompNode outperforms state-of-the-art HGNNs by 20% in average precision and the previously deployed graph database method by 252% in success rate.
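To make the pre-aggregation idea concrete, the following is a minimal sketch and not the CompNode implementation. It assumes that each path instance carries a fixed-length feature vector and that instances of the same path type ending at the same target node are combined by mean pooling; the abstract does not state the aggregation function, so that choice, the function name `pre_aggregate`, and the path-type labels are all hypothetical.

```python
from collections import defaultdict


def pre_aggregate(path_instances):
    """Collapse path instances into one mean vector per (target, path_type).

    path_instances: iterable of (target_node, path_type, feature_vector).
    Mean pooling is an assumption made for illustration only.
    """
    sums = {}
    counts = defaultdict(int)
    for target, path_type, feats in path_instances:
        key = (target, path_type)
        if key not in sums:
            sums[key] = [0.0] * len(feats)
        for i, x in enumerate(feats):
            sums[key][i] += x
        counts[key] += 1
    # One vector per key, regardless of how many instances were seen.
    return {key: [s / counts[key] for s in vec] for key, vec in sums.items()}


# Hypothetical path instances; node ids and path-type names are made up.
instances = [
    ("u1", "user-buys-item", [1.0, 3.0]),
    ("u1", "user-buys-item", [3.0, 5.0]),
    ("u1", "user-follows-user", [2.0, 2.0]),
    ("u2", "user-buys-item", [4.0, 0.0]),
]
aggregated = pre_aggregate(instances)
```

In this sketch the stored data grows with the number of (node, path type) pairs rather than with the number of path instances, which is the kind of memory and sampling saving the abstract attributes to pre-aggregation.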
PVLDB is part of the VLDB Endowment Inc.