go back

Volume 16, No. 11

R^3: Record-Replay-Retroaction for Database-Backed Applications

Authors:
Qian Li, Peter Kraft, Michael Cafarella, Çağatay Demiralp, Goetz Graefe, Christos Kozyrakis, Michael Stonebraker, Lalith Suresh, Xiangyao Yu, Matei Zaharia

Abstract

Developers would benefit greatly from time travel: being able to faithfully replay past executions and retroactively execute modified code on past events. Currently, replay and retroaction are impractical because they require expensively capturing fine-grained timing information to reproduce concurrent accesses to shared state. In this paper, we propose practical time travel for database-backed applications, an important class of distributed applications that access shared state through transactions. We present R3, a novel Record-Replay-Retroaction tool. R3 implements a lightweight interceptor to record concurrency information for applications at transaction-level granularity, enabling replay and retroaction with minimal overhead. We address key challenges in both replay and retroaction. First, we design a novel algorithm for faithfully reproducing application requests running with snapshot isolation, allowing R3 to support most production DBMSs. Second, we develop a retroactive execution mechanism that provides high fidelity with the original trace while supporting nearly arbitrary code modifications. We demonstrate how R3 simplifies debugging for real, hard-to-reproduce concurrency bugs from popular open-source web applications. We evaluate R3 using TPC-C and microservice workloads and show that R3 always-on recording has a small performance overhead (<25% for point queries but <0.1% for complex transactions like in TPC-C) during normal application execution and that R3 can retroactively execute bugfixed code over recorded traces within 0.11 – 0.78× of the original execution time.

PVLDB is part of the VLDB Endowment Inc.

Privacy Policy