699: The Modern Data Stack — with Harry Glaser

Episode Highlights
Notebook Deployment
Harry Glaser highlights how Modelbit deploys machine learning models from notebooks to production. The tool automates a step that traditionally required full-time ML engineers, easing the move from experimental models to production systems. He notes that data scientists often produce models faster than they can be engineered into production, which creates a deployment bottleneck [1].
The key insight is the data scientist is almost always in this Jupyter notebook building the model, and if we can just automate the process of taking it from there into production, we can hopefully save them a lot of time and stress.
---
Modelbit addresses this by integrating with Jupyter notebooks and other environments, so models can be deployed without extensive software engineering skills [2][3].
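As a rough illustration of the notebook-to-production idea, the sketch below fits a toy model and wraps it in a single inference function, which is the kind of unit a deployment tool can package. The data, `fit_line`, and `predict_price` are hypothetical names made up for this example. The final comment reflects an assumption about Modelbit's Python client and should be checked against its documentation.

```python
from statistics import mean

# Toy training data standing in for whatever the notebook explored (hypothetical).
square_feet = [500.0, 750.0, 1000.0, 1250.0]
prices = [100.0, 150.0, 200.0, 250.0]


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


# "Training" happens once, in the notebook session.
SLOPE, INTERCEPT = fit_line(square_feet, prices)


def predict_price(sqft: float) -> float:
    """Inference entry point: the only piece production callers need."""
    return SLOPE * sqft + INTERCEPT


# Assumption, not verified here: with Modelbit's client the deploy step is
# roughly `mb = modelbit.login()` followed by `mb.deploy(predict_price)`.
```

Keeping inference in one plain function with simple inputs and outputs is what makes the hand-off mechanical rather than a rewrite.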
CI/CD Integration
Continuous integration and continuous deployment (CI/CD) play a central role in the lifecycle of machine learning models. Harry Glaser explains that Modelbit integrates with GitHub to automate testing and deployment, so models reach production without manual steps [4]. Data scientists can focus on model development while keeping version control and deployment workflows intact.
The fact that we're backed by your GitHub repo, you do a Modelbit deploy that triggers a git push, and if you've been using a branch and you merge that branch, it triggers your CI/CD.
---
Because models are managed in Git, updates and rollbacks follow the same branch-and-merge workflow as other code, without disrupting the experimental nature of notebooks [5].
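To make the CI/CD step concrete, here is a minimal sketch of the kind of smoke check a pipeline could run when a model branch is merged. It is not Modelbit's own test tooling; `predict_price`, `check_model`, and the expected values are hypothetical stand-ins chosen for illustration.

```python
import math


def predict_price(sqft: float) -> float:
    """Stand-in for the deployed inference function (hypothetical)."""
    return 0.2 * sqft


def check_model(predict) -> list[str]:
    """Run basic sanity checks; an empty list means the model passes."""
    failures = []
    # A known input should land close to its expected output.
    if not math.isclose(predict(1000.0), 200.0, rel_tol=0.05):
        failures.append("prediction for 1000 sq ft is outside the 5% tolerance")
    # Larger homes should not be predicted cheaper than smaller ones.
    if predict(2000.0) <= predict(1000.0):
        failures.append("prediction should increase with square footage")
    return failures


if __name__ == "__main__":
    problems = check_model(predict_price)
    # A non-zero exit code is what tells a CI runner to block the merge.
    raise SystemExit("\n".join(problems) if problems else 0)
```

Returning a list of failures rather than stopping at the first one gives the person reading the CI log the full picture in a single run.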
Modern Stack
The modern data stack, as discussed in this episode, emphasizes integrating machine learning models directly into products. Modelbit supports this by letting teams deploy models with minimal code, which makes deployment accessible to teams with varying levels of expertise [6]. This is particularly useful in industries such as fintech, where real-time model predictions are central to core business functions.
Data science and machine learning is just moving so fast right now. That's actually my favorite part of being in this space, is how fast it's moving.
---
By deploying models within existing products, companies can use the infrastructure they already have to deliver value quickly, which is part of Modelbit's appeal to organizations [7].
Related Episodes


661: Designing Machine Learning Systems — with Chip Huyen

669: Streaming, reactive, real-time machine learning — with Adrian Kosowski

671: Cloud Machine Learning — with Kirill Eremenko and Hadelin de Ponteves

SDS 619: Tools for Deploying Data Models into Production — with Erik Bernhardsson

SDS 435: Scaling Up Machine Learning — with Erica Greene

SDS 573: Automating ML Model Deployment — with Doris Xin

658: How to Build Data and ML Products Users Love — with Brian T. O'Neill

SDS 595: Data Engineering 101 — with Joe Reis and Matt Housley

753: Blend Any Programming Languages in Your ML Workflows — with Dr. Greg Michaelson

679: The A.I. and Machine Learning Landscape — with investor George Mathew

649: Introduction to Machine Learning — with Kirill Eremenko and Hadelin de Ponteves

647: Is Data Science Still Sexy? — with Tom Davenport

SDS 468: The History of Data — with Jon Krohn

732: Data Science for Astronomy — with Dr. Daniela Huppenkothen

SDS 433: Data Science Trends for 2021 — with Ben Taylor