go back
go back
Volume 18, No. 2
Less is More: Efficient Time Series Dataset Condensation via Two-fold Modal Matching
Abstract
The expanding instrumentation of processes throughout society with sensors yields a proliferation of time series data that may in turn enable important applications, e.g., related to trans portation infrastructures or power grids. Machine-learning based methods are increasingly being used to extract value from such data. We provide means of reducing the resulting considerable computational and data storage costs. We achieve this by providing means of condensing large time series datasets such that models trained on the condensed data achieve performance comparable to those trained on the original, large data. Specifically, we propose a time series d ataset c ondensation framework, TimeDC, that employs two-fold modal matching, encompassing frequency matching and training trajectory matching. Thus, TimeDC performs time series feature extraction and decomposition-driven frequency matching to preserve complex temporal dependencies in the reduced time series. Further, TimeDC employs curriculum training trajectory matching to ensure effective and generalized time series dataset condensation. To avoid memory overflow and to reduce the cost of dataset condensation, the framework includes an expert buffer storing pre-computed expert trajectories. Extensive experiments on real data offer insight into the effectiveness and efficiency of the proposed solutions.
PVLDB is part of the VLDB Endowment Inc.
Privacy Policy