go back

Volume 15, No. 11

Optimizing Differentially-Maintained Recursive Queries on Dynamic Graphs

Authors:
Khaled Ammar (University of Waterloo, Thomson Reuters Labs)* Siddhartha Sahu (University of Waterloo) Semih Salihoglu (University of Waterloo) Tamer Özsu (University of Waterloo)

Abstract

Differential computation (DC) is a highly general incremental computation/view maintenance technique that can maintain the output of an arbitrary and possibly recursive dataflow computation upon changes to the base inputs of the dataflow. As such, it is a promising technique for graph database management systems (GDBMS) that aim to support continuous recursive queries over dynamic graphs, such as single pair shortest paths, variable-length path, or regular path queries. Although differential computation can be highly efficient for maintaining these queries, it can require prohibitively large amount of memory, as its generality is based on keeping track of all input and output differences of the operators in the dataflow across all iterations. This paper studies how to reduce the memory overheads of DC with the goal of increasing the scalability of systems that adopt it. We propose a suite of optimizations that are based on dropping the differences of operators, both completely or partially, and recomputing these differences when necessary. Our optimizations that drop parts of the differences of an operator require data structures to keep track of the dropped differences, for which we offer solutions that use both deterministic and probabilistic data structures. We present extensive experiments evaluating the scalability and performance trade-offs of our optimizations and demonstrate that they can increase the scalability of a DC-based continuous query processor, implemented as an extension to the GraphflowDB GDBMS, by up to 20× while still providing better performance than rerunning the queries from scratch.

PVLDB is part of the VLDB Endowment Inc.

Privacy Policy