Mach: A Pluggable Metrics Storage Engine for the Age of Observability
Abstract
Observability is gaining traction as a key capability for understanding the internal behavior of large-scale system deployments. Instrumenting these systems to report quantitative telemetry data called metrics enables engineers to monitor and maintain services that operate at an enormous scale so they can respond rapidly to any issues that might arise. To be useful, metrics must be ingested, stored, and queryable in real time, but many existing solutions cannot keep up with the sheer volume of generated data. This paper describes Mach, a pluggable storage engine we are building specifically to handle high-volume metrics data. Similar to many popular libraries (e.g., Berkeley DB, LevelDB, RocksDB, WiredTiger), Mach provides a simple API to store and retrieve data. Mach’s lean, loosely coordinated architecture aggressively leverages the characteristics of metrics data and observability workloads, yielding an order-of-magnitude improvement over existing approaches—especially those marketed as “time series database systems” (TSDBs). In fact, our preliminary results show that Mach can achieve nearly 10× higher write throughput and 3× higher read throughput compared to several widely used alternatives.