go back

Volume 14, No. 3

Comprehensive and Efficient Workload Compression

Authors:
Shaleen Deep (University of Wisconsin-Madison), Anja Gruenheid (Microsoft), Paraschos Koutris (University of Wisconsin-Madison), Jeff Naughton (Google), Stratis Viglas (University of Edinburgh)

Abstract

This work studies the problem of constructing a representa- tive workload from a given input workload where the former serves as an approximation with guarantees of the latter. We discuss our work in the context of workload analysis and monitoring. As an example, evolving system usage patterns in a database system can cause load imbalance and perfor- mance regressions which can be controlled by monitoring system usage patterns, i.e., a representative workload, over time. In order to construct such a workload in a principled manner, we formalize the notion of workload representativ- ity and coverage. These metrics capture the intuition that the distribution of features in a compressed workload should match a target distribution, increasing representativity, and include common queries as well as outliers, increasing cov- erage. We show that solving this problem optimally is NP- Hard and present a novel greedy algorithm that provides approximation guarantees. We compare our techniques to established algorithms in this problem space such as sam- pling and clustering, and demonstrate advantages and key trade-offs of our techniques.

PVLDB is part of the VLDB Endowment Inc.

Privacy Policy