Fixed It For You: Protocol Repair Using Lineage Graphs

Authors:
Lennart Oldenburg, Xiangfeng Zhu, Kamala Ramasubramanian, Peter Alvaro

Abstract

Distributed systems are difficult to program and near impossible to debug. Existing tools that focus on single-node computation are poorly suited to diagnosing errors that involve the interaction of many machines over time. The database notion of provenance would appear to be a better fit for answering the sort of cause-and-effect questions that arise during debugging, but existing provenance-based approaches target only a narrow set of debugging scenarios. In this paper, we explore the limits of provenance-based debugging. We propose a simple query language to express common debugging questions as expressions over provenance graphs capturing traces of distributed executions. When programs and their correctness properties are written in the same high-level declarative language, we can go a step further than highlighting errors and often generate repairs for distributed programs. We validate our prototype debugger, Nemo, on six protocols from our taxonomy of 52 real-world distributed bugs, either generating repair rules or pointing the programmer to root causes.