Volume 14, No. 13
TSCache: An Efficient Flash-based Caching Scheme for Time-series Data Workloads
Abstract
Time-series databases are becoming an indispensable component in today's data centers. In order to manage the rapidly growing time-series data, we need an efficient system solution to handle the huge traffic of time-series data queries at high speed. An efficient approach is to deploy a large-capacity cache system to relieve the burden on the congested backend databases and accelerate query processing. However, designing an efficient caching scheme for time-series data is non-trivial. Time-series data is drastically different from traditional file or object-based data. Its unique properties bring both opportunities and critical challenges. In this paper, we present a flash-based cache system for time-series data, called TSCache. By fully exploiting the unique properties of time-series data, we have developed a set of optimization schemes, such as customized slab management, a two-layered data indexing structure, an adaptive time-aware caching policy, and an optimized low-cost compaction scheme, to enhance caching efficiency. We have implemented a fully functional prototype based on Twitter's Fatcache. Our experimental results based on five real-world time-series datasets show that TSCache can significantly improve client query performance. With a cache sized at only 10% of the total dataset, our time-series cache system can improve the hit ratio up to 92.1%, increase the system bandwidth by a factor of up to 5.2, and reduce the latency by up to 79%.
PVLDB is part of the VLDB Endowment Inc.