Volume 15, No. 1
Learned Cardinality Estimation: A Design Space Exploration and A Comparative Evaluation
Abstract
Cardinality estimation, which predicts the number of rows in the result of an SQL query, is core to the query optimizers of database management systems (DBMSs). Non-learned methods, especially those based on histograms and sampling, have been predominant for decades and are widely used in commercial and open-source DBMSs. Nevertheless, histograms and samples can only summarize one or a few columns, and this oversimplification of the original relational table(s) falls short of capturing the joint data distribution over an arbitrary combination of columns. Consequently, these traditional methods typically make poor predictions for hard cases such as queries over multiple columns, queries with multiple predicates, and joins between multiple tables. Recently, learned cardinality estimators have been widely studied. Because these learned estimators can better capture the data distribution and query characteristics, empowered by recent advances in (deep learning) models, they outperform non-learned methods in many cases. The goals of this paper are to provide a design space exploration of learned cardinality estimators and to comprehensively compare the state-of-the-art learned approaches, so as to give practitioners guidance on which method to use under various practical scenarios.
PVLDB is part of the VLDB Endowment Inc.