Published Jul 25, 2023

699: The Modern Data Stack — with Harry Glaser

Harry Glaser delves into the transformative innovations in the modern data stack, highlighting the role of BI tools like Looker and Tableau in model monitoring, the efficiencies of version control with tools like Modelbit and Snowflake, and the automated deployment of machine learning models from notebooks to production, all of which enhance collaboration and accessibility for data professionals and business stakeholders alike.
Episode Highlights
Super Data Science: ML & AI Podcast with Jon Krohn logo

Popular Clips

Episode Highlights

  • Notebook Deployment

    highlights the efficiency of deploying machine learning models from notebooks to production using Modelbit. This tool simplifies the transition from experimental models to robust production systems by automating the deployment process, which traditionally required full-time ML engineers. notes that data scientists often produce models faster than they can be engineered into production, creating a bottleneck in the deployment process 1.

    The key insight is the data scientist is almost always in this Jupyter notebook building the model, and if we can just automate the process of taking it from there into production, we can hopefully save them a lot of time and stress.

    ---

    Modelbit addresses this by integrating with Jupyter notebooks and other environments, allowing seamless deployment and reducing the need for extensive software engineering skills 2 3.

       

    CI/CD Integration

    Continuous integration and deployment (CI/CD) play a crucial role in the lifecycle of machine learning models. explains that Modelbit integrates with GitHub to automate testing and deployment processes, ensuring models are production-ready without manual intervention 4. This integration allows data scientists to focus on model development while maintaining robust version control and deployment workflows.

    The fact that we're backed by your GitHub repo, you do a model bit deploy that triggers a git push, and if you've been using a branch and you merge that branch, it triggers your CI CD.

    ---

    By leveraging Git for model management, Modelbit provides a clean and efficient workflow, enabling seamless updates and rollbacks without disrupting the experimental nature of notebooks 5.

       

    Modern Stack

    The modern data stack, as discussed by and , emphasizes the importance of integrating machine learning models directly into products. Modelbit facilitates this by allowing models to be deployed with minimal code, enhancing accessibility for teams with varying expertise levels 6. This approach is particularly beneficial in industries like fintech, where real-time model predictions are crucial for core business functions.

    Data science and machine learning is just moving so fast right now. That's actually my favorite part of being in this space, is how fast it's moving.

    ---

    By deploying models within existing products, companies can leverage their infrastructure to deliver immediate value, making Modelbit a preferred choice for many organizations 7.

Related Episodes