Volume 18, No. 3
SDEcho: Efficient Explanation of Aggregated Sequence Difference
Abstract
Understanding the reasons behind differences between aggregated sequences derived from SQL queries is crucial for data scientists. However, existing methods often suffer from being labor-intensive, lacking scalability, providing only approximate solutions, and inadequately supporting sequence difference explanations. In response, we introduce SDEcho, a novel framework designed to automate the explanation searching for sequence differences in high-dimensional and high-volume datasets. SDEcho utilizes advanced pruning techniques, considering pattern, order, and dimension perspectives, as well as their interactions, to prune the entire explanation space while maintaining explanations accurate and concise. This hybrid pruning approach significantly accelerates the explanation searching process, making SDEcho a valuable tool for data analysis tasks. Extensive experiments on synthetic and real-world datasets, along with a case study, demonstrate that SDEcho outperforms existing methods in terms of both effectiveness and efficiency.
PVLDB is part of the VLDB Endowment Inc.