Designing Production-Friendly Machine Learning

Authors:

Matei Zaharia (Stanford and Databricks)

Abstract

Building production ML applications is difficult because of their resource cost and complex failure modes. I’ll discuss these challenges from two perspectives: the Stanford DAWN lab and experience with large-scale commercial ML users at Databricks. I’ll then present two emerging ideas that help address them. The first is “ML platforms”, a class of software systems that standardize the interfaces used in ML applications to make them easier to build and maintain. I’ll give a few examples, including the open source MLflow system from Databricks. The second idea is models that are more “production-friendly” by design. As a concrete example, I’ll discuss retrieval-oriented NLP models such as Stanford’s ColBERT, which query documents from an updateable corpus to perform tasks such as question answering. This design offers several practical advantages, including low computational cost, high interpretability, and very fast updates to the model’s “knowledge”. These models are an exciting alternative to large language models such as GPT-3.
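As a rough illustration of the "updateable corpus" idea, the sketch below answers a query by looking up documents in a store that can be changed at any time, with no retraining step. It is not ColBERT and makes no claim about how ColBERT works internally: the `ToyRetriever` class, its `upsert` and `search` methods, and the token-overlap score are all hypothetical stand-ins chosen only to keep the example self-contained.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyRetriever:
    """Hypothetical toy retriever: scores documents by token overlap.

    A real retrieval-oriented model would use learned representations;
    the overlap score here is only an assumption for illustration.
    """

    def __init__(self):
        # doc_id -> (original text, token counts)
        self.docs = {}

    def upsert(self, doc_id, text):
        """Add or replace a document; this is the whole 'update' step."""
        self.docs[doc_id] = (text, Counter(tokenize(text)))

    def search(self, query, k=1):
        """Return up to k (doc_id, text) pairs with a non-zero overlap score."""
        q = Counter(tokenize(query))
        scored = []
        for doc_id, (text, counts) in self.docs.items():
            score = sum(min(q[t], counts[t]) for t in q)
            scored.append((score, doc_id, text))
        # Highest score first; ties broken by doc_id for determinism.
        scored.sort(key=lambda s: (-s[0], s[1]))
        return [(doc_id, text) for score, doc_id, text in scored[:k] if score > 0]


retriever = ToyRetriever()
retriever.upsert("a", "MLflow tracks experiments and packages models")
before = retriever.search("capital of Atlantis")

# New "knowledge" arrives: one upsert, no model retraining.
retriever.upsert("b", "The capital of Atlantis is Poseidonia")
after = retriever.search("capital of Atlantis")
```

In this toy setup the returned document also serves as the evidence for the answer, which is a simplified view of the interpretability advantage mentioned above.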

PVLDB is part of the VLDB Endowment Inc.

Volume 14, No. 13
