KÙZU Graph Database Management System
Abstract
Datasets and workloads of popular applications that use graph database management systems (GDBMSs) require a set of storage and query processing features that RDBMSs do not traditionally optimize for. These include optimizations for: (i) many-to-many (m-n) joins; (ii) cyclic joins; (iii) recursive joins; (iv) semi-structured data storage; and (v) support for universal resource identifiers. We present Kùzu, a new GDBMS we are developing at University of Waterloo that aims to integrate state-of-art storage, indexing, and query processing techniques to highly optimize for this feature set. This paper serves the dual role of describing our vision for Kùzu and the system’s factorized query processor, which is based on two design goals: (i) achieving good factorization structures under m-n joins; and (ii) ensuring sequential scans that avoid entire scans of columns and join indices when possible. As we show these two goals can sometimes conflict and we describe our core binary and worst-case optimal (multiway) join operators that simultaneously achieve both goals. Kùzu is actively being developed to be a fully functional open-source DBMS with the goal of wide user adoption.