TMLS2021 Workshop: Towards Observability for Machine Learning Pipelines




About this event

Speaker: Shreya Shankar, Ph.D. Student, UC Berkeley


Software organizations are increasingly incorporating machine learning (ML) into their product offerings, driving a need for new data management tools. Many of these tools facilitate the initial development and deployment of ML applications, contributing to a crowded landscape of disconnected solutions targeted at different stages, or components, of the ML lifecycle. A lack of end-to-end ML pipeline visibility makes it hard to address any issues that may arise after a production deployment, such as unexpected output values or lower-quality predictions. In this talk, we propose a system that wraps around existing tools in the ML development stack and offers end-to-end observability. We introduce our prototype and our vision for mltrace, a platform-agnostic system that provides observability to ML practitioners by (1) executing predefined tests and monitoring ML-specific metrics at component runtime, (2) tracking end-to-end data flow, and (3) allowing users to ask arbitrary post-hoc questions about pipeline health.
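The three capabilities listed in the abstract can be illustrated with a small sketch. This is not mltrace's actual API. Every name here (`observe`, `ComponentRun`, `RUN_LOG`, `trace`, the file names, and the two test functions) is a hypothetical chosen for illustration, and the run log is a plain in-memory list rather than a persistent store.

```python
import functools
import time
from dataclasses import dataclass, field


@dataclass
class ComponentRun:
    """One logged execution of a pipeline component (hypothetical schema)."""
    component: str
    inputs: list
    outputs: list
    start: float
    end: float
    failed_tests: list = field(default_factory=list)


RUN_LOG = []  # assumption: in-memory store; a real system would persist runs


def observe(name, tests=()):
    """Wrap a component so each call runs its tests and logs its data flow."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(input_id, output_id, data):
            start = time.time()
            result = fn(data)
            # (1) execute predefined tests on the output at component runtime
            failed = [t.__name__ for t in tests if not t(result)]
            # (2) record which input produced which output
            RUN_LOG.append(
                ComponentRun(name, [input_id], [output_id], start, time.time(), failed)
            )
            return result
        return wrapper
    return decorator


def no_missing_values(rows):
    return all(r is not None for r in rows)


def in_unit_interval(scores):
    return all(0.0 <= s <= 1.0 for s in scores)


@observe("clean", tests=[no_missing_values])
def clean(rows):
    return [r for r in rows if r is not None]


@observe("predict", tests=[in_unit_interval])
def predict(rows):
    # Deliberately unclipped so that a large input yields an out-of-range score.
    return [r / 10 for r in rows]


def trace(output_id):
    """Walk the log backwards from an output to the runs that produced it."""
    lineage, frontier = [], [output_id]
    while frontier:
        current = frontier.pop()
        for run in RUN_LOG:
            if current in run.outputs:
                lineage.append(run)
                frontier.extend(run.inputs)
    return lineage


cleaned = clean("raw.csv", "clean.csv", [3, None, 12])
scores = predict("clean.csv", "scores.csv", cleaned)

# (3) post-hoc questions: which components failed a test, and what fed this output?
print([r.component for r in RUN_LOG if r.failed_tests])
print([r.component for r in trace("scores.csv")])
```

Wrapping components with a decorator leaves the component code itself unchanged, which is one way a system could sit around existing tools instead of replacing them.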

What You'll Learn:

This talk/workshop is designed for ML practitioners interested in maintaining ML pipelines in production. We will not be discussing model development, data analysis, or feature engineering techniques used to build ML prototypes.

In this talk/workshop, you will learn about:

  • Types of bugs that occur in production ML pipelines
  • How to incorporate testing into your ML application codebase in a sustainable way
  • How to enable on-call ML engineers or data scientists to debug broken pipelines with ML-specific logging, monitoring, and querying


Shreya Shankar is a computer scientist living in the Bay Area who builds systems to operationalize machine learning (ML) workflows. Her research focuses on end-to-end observability for ML systems, particularly in the context of heterogeneous stacks of tools. She is currently pursuing her Ph.D. in the RISE lab at UC Berkeley. Previously, she was the first ML engineer at Viaduct, did research at Google Brain, and earned her BS and MS in computer science from Stanford.
