Templating Shuffles
Abstract
Cloud data centers are evolving fast. At the same time, today’s large-scale data analytics applications require non-trivial performance tuning that is often specific to the applications, workloads, and data center infrastructure. We propose TeShu, which makes network shuffling an extensible unified service layer common to all data analytics. Since an optimal shuffle depends on a myriad of factors, TeShu introduces parameterized shuffle templates, instantiated by accurate and efficient sampling that enables TeShu to dynamically adapt to different application workloads and data center layouts. Our preliminary experimental results show that TeShu efficiently enables shuffling optimizations that improve performance and adapt to a variety of data center network scenarios.