Vicki Boykis — Machine Learning Across Industries

Episode Highlights
Production Hurdles
Vicki Boykis describes why deploying machine learning models to production is difficult. She notes that the process is more complex than traditional software deployment because it also requires managing data, planning for model drift, and orchestrating services [1]. She explains that prototyping and then solidifying the model are both crucial steps, since packaging a model often means exposing it through a REST endpoint or shipping it in a Docker container [2].
Putting stuff in production is really hard. And so I would say that's the biggest thing.
---
These challenges underscore the importance of thorough planning and of understanding both the data and the model.
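As a rough illustration of the "REST endpoint" packaging step mentioned above, here is a minimal sketch using only the Python standard library. Nothing in it comes from the episode: the linear scorer, the `/predict` path, the JSON payload shape, and every function name are assumptions chosen for the example, and a real service would add the data management, drift monitoring, and orchestration pieces Vicki describes.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for a trained model: a fixed linear scorer.
WEIGHTS = [0.5, -0.25]
BIAS = 1.0


def predict(features):
    """Score one feature vector with the toy linear model."""
    if len(features) != len(WEIGHTS):
        raise ValueError(f"expected {len(WEIGHTS)} features")
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))


class PredictHandler(BaseHTTPRequestHandler):
    """Handles POST /predict with a JSON body like {"features": [..]}."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404, "unknown path")
            return
        try:
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length))
            result = {"prediction": predict(payload["features"])}
        except (ValueError, KeyError, TypeError):
            # Malformed JSON, missing key, or wrong feature count.
            self.send_error(400, "bad request")
            return
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def start_server(port=0):
    """Start the endpoint on localhost in a background thread."""
    server = HTTPServer(("127.0.0.1", port), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_endpoint(server, features):
    """Client helper: POST features to the running server."""
    url = f"http://127.0.0.1:{server.server_address[1]}/predict"
    request = urllib.request.Request(
        url,
        data=json.dumps({"features": features}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

Calling `start_server()` and then `call_endpoint(server, [2.0, 4.0])` exercises the full request path. The same handler could be placed inside a Docker image, which is the other packaging route the summary mentions.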
Metadata Management
Managing metadata is a significant yet often overlooked aspect of data science projects. Vicki emphasizes that many companies struggle with metadata management, which is crucial for updating models and conducting analyses [3]. She mentions that while open-source tools like Amundsen are emerging, there is still no single solution for comprehensive metadata management.
People actually clamor for that, more so than even visibility into how to manage the model.
---
Without standardized metadata, companies face issues like not knowing which data is proprietary or how to efficiently query data lakes.
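To make the idea of standardized metadata concrete, the sketch below shows one hypothetical shape for a dataset record and a tiny in-memory catalog that can answer "which data is proprietary?". It is not Amundsen's API or any tool discussed in the episode; the field names, the `Catalog` class, and the sample entries are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical minimal metadata for one dataset in a data lake."""
    name: str
    owner: str
    location: str
    proprietary: bool
    tags: tuple = ()


class Catalog:
    """Toy in-memory catalog; a real one would be a shared service."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        # Refuse silent overwrites so ownership stays unambiguous.
        if record.name in self._records:
            raise ValueError(f"dataset already registered: {record.name}")
        self._records[record.name] = record

    def find(self, proprietary=None, tag=None):
        """Return names of datasets matching the given filters, sorted."""
        matches = []
        for record in self._records.values():
            if proprietary is not None and record.proprietary != proprietary:
                continue
            if tag is not None and tag not in record.tags:
                continue
            matches.append(record.name)
        return sorted(matches)


# Invented sample entries, for illustration only.
catalog = Catalog()
catalog.register(DatasetRecord(
    "clickstream", "web-team", "s3://example-lake/clickstream",
    proprietary=True, tags=("events",)))
catalog.register(DatasetRecord(
    "census", "analytics", "s3://example-lake/census",
    proprietary=False, tags=("reference",)))
```

With records in this form, `catalog.find(proprietary=True)` lists the proprietary datasets, and the `location` field gives analysts a starting point for querying the lake.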
Team Structures
The structure of data teams significantly impacts their effectiveness. Vicki discusses the benefits of both centralized and embedded data science teams, noting that smaller companies may benefit from a centralized approach, while larger companies might find embedded teams more effective [4]. She highlights the risk of siloed teams leading to duplicated efforts, emphasizing the importance of collaboration.
I've seen it work well different ways in different companies.
---
Ultimately, the choice between centralized and embedded teams depends on the company's size and specific needs.
Related Episodes


Zack Chase Lipton — The Medical Machine Learning Landscape

Richard Socher — The Challenges of Making ML Work in the Real World

Nicolas Koumchatzky — Machine Learning in Production for Self-Driving Cars

Alyssa Simpson Rochwerger — Responsible ML in the Real World

James Cham — Investing in the Intersection of Business and Technology

Operationalizing Machine Learning: Interview with Shreya Shankar

Johannes Otterbach — Unlocking ML for Traditional Companies

Chip Huyen of Claypot AI — ML Research and Production Pipelines

Angela & Danielle — Designing ML Models for Millions of Consumer Robots

Luis Ceze — Accelerating Machine Learning Systems

D. Sculley — Technical Debt, Trade-offs, and Kaggle

Aaron Colak — ML and NLP in Experience Management

Josh Tobin — Productionizing ML Models

Chris, Shawn, and Lukas — The Weights & Biases Journey

Anthony Goldbloom — How to Win Kaggle Competitions