
Volume 14, No. 12

New Trends in High-D Vector Similarity Search: AI-driven, Progressive, and Distributed

Authors:
Karima Echihabi (Mohammed VI Polytechnic University), Themis Palpanas (University of Paris), Kostas Zoumpatianos (Snowflake Computing)

Abstract

Similarity search is a core operation of many critical data science applications that involve massive collections of high-dimensional (high-d) objects. It finds the objects in a collection that are close to a given query according to some definition of sameness. Objects can be data series, text, multimedia, graphs, database tables, or deep network embeddings. In this tutorial, we revisit the similarity search problem in light of the recent advances in the field and the new big data landscape. We discuss key data science applications that require efficient high-d similarity search, we survey recent approaches and share surprising insights about their strengths and weaknesses, and we discuss open research problems, including the directions of AI-driven, progressive, and distributed high-d similarity search.

PVLDB is part of the VLDB Endowment Inc.
