
Volume 18, No. 3

Cardinality Estimation for Similarity Search on High-Dimensional Data Objects: The Impact of Reference Objects

Authors:
Hai Lan, Shixun Huang, Zhifeng Bao, Renata Borovica-Gajic

Abstract

In this paper, we study the problem of cardinality estimation for similarity search on high-dimensional data (CE4HD). We aim to perform CE4HD with high data robustness (i.e., robust to different datasets), query robustness (i.e., robust to large cardinality variance and scale) and efficiency. We propose to leverage the cardinality estimation of selected objects (called reference objects) in the database to achieve the above. Specifically, we propose two techniques that adopt different strategies to select and leverage reference objects, as well as strategies to support efficient computation in dynamic databases. Extensive experiments on datasets from diverse domains show that our methods achieve up to ∼10x speed-up and up to ∼136x smaller mean Q-error compared to existing studies.
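
The abstract reports accuracy as mean Q-error without defining it. The sketch below uses the definition common in the cardinality estimation literature, the larger of estimate/actual and actual/estimate, averaged over queries. It is an illustration only, not the paper's evaluation code: the function names and the `floor` parameter (which clamps values below 1 to avoid division by zero) are assumptions made here.

```python
from typing import Iterable


def q_error(estimate: float, actual: float, floor: float = 1.0) -> float:
    """Q-error of one estimate: max(est/act, act/est), always >= 1.

    `floor` is an assumed convention: both values are clamped to at
    least `floor` so that a zero cardinality does not divide by zero.
    """
    est = max(estimate, floor)
    act = max(actual, floor)
    return max(est / act, act / est)


def mean_q_error(estimates: Iterable[float], actuals: Iterable[float],
                 floor: float = 1.0) -> float:
    """Average Q-error over paired (estimate, actual) cardinalities."""
    pairs = list(zip(estimates, actuals))
    if not pairs:
        raise ValueError("need at least one (estimate, actual) pair")
    return sum(q_error(e, a, floor) for e, a in pairs) / len(pairs)
```

Under this definition a perfect estimate scores 1, and over- and under-estimation by the same factor are penalized equally, so a "136x smaller mean Q-error" refers to a ratio of these averages.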

PVLDB is part of the VLDB Endowment Inc.
