Podcast: The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Episode: Scaling Model Training with Kubernetes at Stripe with Kelley Rivoire
Description: Today we're joined by Kelley Rivoire, engineering manager working on machine learning infrastructure at Stripe. Kelley and I caught up at a recent Strata Data conference to discuss:
• Her talk "Scaling model training: From flexible training APIs to resource management with Kubernetes."
• Stripe's machine learning infrastructure journey, including their start from a production focus.
• Internal tools used at Stripe, including Railyard, an API built to manage model training at scale & more!