Cerebro: A Layered Data Platform for Scalable Deep Learning
Abstract
Deep learning (DL) is gaining popularity across many domains thanks to tools such as TensorFlow and easier access to GPUs. But building large-scale DL applications is still too resource-intensive and painful for all but the big tech firms. A key reason for this pain is the expensive model selection process needed to get DL to work well. Existing DL systems treat this process as an afterthought, leading to massive resource wastage and a usability mess. To tackle these issues, we present our vision of a first-of-its-kind data platform for scalable DL, Cerebro, inspired by lessons from the database world. We elevate the DL model selection process with higher-level APIs already inherent in practice and devise a series of novel multi-query optimization techniques to substantially raise resource efficiency. This vision paper presents our system design philosophy and architecture, our recent research and open research questions, initial results, and a discussion of tangible paths to practical impact.