go back
go back
Volume 18, No. 3
IcedTea: Efficient and Responsive Time-Travel Debugging in Dataflow Systems
Abstract
Dataflow systems have an increasing need to support a wide range of tasks in data-centric applications using latest techniques such as machine learning. These tasks often involve custom functions with complex internal states. Consequently, users need enhanced debugging support to understand runtime behaviors and investigate internal states of dataflows. Traditional forward debuggers allow users to follow the chronological order of operations in an execution. Therefore, a user cannot easily identify a past runtime behavior after an unexpected result is produced. In this paper, we present a novel time-travel debugging paradigm called IcedTea , which supports reverse debugging. In particular, in a dataflow’s execution, which is inherently distributed across multiple operators, the user can periodically interact with the job and retrieve the global states of the operators. After the execution, the system allows the user to roll back the dataflow state to any past interactions. The user can use step instructions to repeat the past execution to understand how data was processed in the original execution. We give a full specification of this powerful paradigm, study how to reduce its runtime overhead and develop techniques to support debugging instructions responsively. Our experiments on real-world datasets and workflows show that IcedTea can support responsive time-and workflows show that IcedTea can support responsive timetravel debugging with low time and space overhead.
PVLDB is part of the VLDB Endowment Inc.
Privacy Policy