Learned Cardinality Estimation: A Design Space Exploration and A Comparative Evaluation

Authors:

Ji Sun (Tsinghua University), Jintao Zhang (Tsinghua University), Zhaoyan Sun (Tsinghua University), Guoliang Li (Tsinghua University)*, Nan Tang (Qatar Computing Research Institute, HBKU)

Abstract

Cardinality estimation, which predicts the number of rows an SQL query returns, is core to the query optimizers of database management systems (DBMSs). Non-learned methods, especially those based on histograms and sampling, have been the predominant methods for decades and are widely used in commercial and open-source DBMSs. Nevertheless, histograms and samples can summarize only one or a few columns, and they fall short of capturing the joint data distribution over an arbitrary combination of columns because they oversimplify the original relational table(s). Consequently, these traditional methods typically make bad predictions for hard cases such as queries over multiple columns, queries with multiple predicates, and joins between multiple tables. Recently, learned cardinality estimators have been widely studied. Because these learned estimators can better capture the data distribution and query characteristics, empowered by recent advances in (deep learning) models, they outperform non-learned methods in many cases. The goals of this paper are to provide a design space exploration of learned cardinality estimators and a comprehensive comparison of the state-of-the-art learned approaches, so as to give practitioners guidance on which method to use in various practical scenarios.
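To make the abstract's point about per-column summaries concrete, the following is a minimal sketch, not taken from the paper, of how an estimator that keeps only single-column statistics can misjudge a query over correlated columns. The table, the column names `a` and `b`, and the helper functions are hypothetical, and the estimator uses the textbook attribute-independence assumption rather than any specific system's method.

```python
def true_cardinality(rows, predicates):
    """Count rows that satisfy every equality predicate (the exact answer)."""
    return sum(1 for r in rows if all(r[c] == v for c, v in predicates.items()))


def independence_estimate(rows, predicates):
    """Estimate cardinality from per-column selectivities only.

    Assumption (illustrative): columns are treated as independent, so the
    selectivities of the individual predicates are simply multiplied.
    """
    n = len(rows)
    estimate = float(n)
    for col, value in predicates.items():
        matches = sum(1 for r in rows if r[col] == value)
        estimate *= matches / n
    return estimate


# Hypothetical table: column "b" always equals column "a",
# so the two columns are perfectly correlated.
rows = [{"a": i % 10, "b": i % 10} for i in range(10_000)]

query = {"a": 3, "b": 3}  # SELECT COUNT(*) WHERE a = 3 AND b = 3
actual = true_cardinality(rows, query)
estimated = independence_estimate(rows, query)

print(f"actual={actual}, estimated={estimated:.1f}")
```

In this toy table each predicate alone selects one tenth of the rows, so the independence-based estimate is about one hundredth of the table, while the true result is one tenth of it. The per-column statistics are exact here; the error comes entirely from ignoring the joint distribution.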

PVLDB is part of the VLDB Endowment Inc.

Volume 15, No. 1
