Volume 16, No. 12

CEDA: Learned Cardinality Estimation with Domain Adaptation

Authors:
Zilong Wang, Qixiong Zeng, Ning Wang, Haowen Lu, Yue Zhang

Abstract

Cardinality estimation (CE) is a fundamental problem in DBMS query optimization, and deep learning techniques have brought significant advances to CE research. However, current query-driven CE methods require training data large enough to cover all possible query regions for accurate estimation, and they also suffer from workload drift. Moreover, retraining or fine-tuning requires cardinality labels as ground truth, and obtaining those labels from the DBMS is expensive. We therefore propose CEDA, a novel domain-adaptive CE system. CEDA achieves more accurate estimates by automatically generating training workloads according to the data distribution in the database and by incorporating histogram information into an attention-based cardinality estimator. To address workload drift in real-world environments, CEDA adopts a domain adaptation strategy that makes the model more robust, so that it performs well on an unlabeled workload whose feature distribution differs substantially from that of the training set.

PVLDB is part of the VLDB Endowment Inc.