Evolving MLOps Platforms for Generative AI and Agents with Abhijit Bose - 714

Episode Highlights
Kubernetes
The integration of Kubernetes has significantly enhanced AI platform development at Capital One. Bose explains that the platform's control plane, built on Kubernetes, gives the team the flexibility to incorporate a range of tools and services, including those from AWS and open-source communities. This flexibility has been crucial in extending the machine learning platform to support generative AI use cases, allowing the team to adapt quickly. Bose also highlights that data annotation is more complex in generative AI than in traditional machine learning and calls for more refined capabilities and tools.
Observability
Enhancing observability tools is vital for managing the complexities of generative AI applications. Bose notes that traditional machine learning already requires solid monitoring of model drift and input features, while generative AI introduces new challenges such as LLM hallucinations, which call for guardrails and logging systems. These additions support proper governance and execution of agentic workflows, which makes observability both more important and more complex. Bose emphasizes reusing existing anomaly detection algorithms and extending them to handle new data types, so that monitoring stays comprehensive across platforms.
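To make the drift-monitoring idea concrete, here is a minimal sketch of one common way to score drift in a single numeric input feature, the Population Stability Index (PSI). The episode summary does not say which metric Capital One uses, so the choice of PSI, the function name `population_stability_index`, and the bin count are all assumptions made for illustration.

```python
import math


def population_stability_index(reference, current, bins=10, eps=1e-6):
    """Score how far `current` has drifted from `reference` for one feature.

    Hypothetical helper: PSI is an assumed metric, not one named in the episode.
    Bin edges are taken from the reference sample; out-of-range values in
    `current` are clamped into the first or last bin.
    """
    if not reference or not current:
        raise ValueError("both samples must be non-empty")

    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width for a constant feature

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Usage: compare a training-time sample against recent production inputs.
training_sample = [i / 100 for i in range(100)]
production_sample = [0.5 + i / 200 for i in range(100)]
drift_score = population_stability_index(training_sample, production_sample)
```

A score of zero means the two binned distributions match; larger scores mean more drift. Any alert threshold would have to be tuned per feature.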
Inference Optimization
Optimizing inference efficiency is a critical focus at Capital One, with efforts to reduce cost and latency from the outset. Bose shares that cost per token and latency are key performance indicators, and that keeping both low requires continuous optimization of GPU utilization, among other techniques. This involves using both proprietary and open-source tools to improve inference workflows so that fine-tuned models can be deployed effectively. Bose highlights the collaboration between science and engineering teams to integrate techniques such as quantization and speculative decoding into the inference systems.
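As a rough illustration of the two KPIs mentioned above, the sketch below computes cost per token from GPU pricing and throughput, and a 95th-percentile latency from request timings. The function names, the steady-throughput assumption, and the example numbers are hypothetical and do not come from the episode.

```python
def cost_per_token(gpu_hourly_cost_usd, num_gpus, tokens_per_second):
    """USD per generated token, assuming steady throughput across the GPUs."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    cost_per_second = gpu_hourly_cost_usd * num_gpus / 3600.0
    return cost_per_second / tokens_per_second


def p95_latency(latencies_ms):
    """95th-percentile latency using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("latencies_ms must be non-empty")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


# Usage with made-up numbers: one GPU at 3.60 USD/hour serving 1000 tokens/s.
usd_per_token = cost_per_token(3.60, 1, 1000.0)
tail_latency_ms = p95_latency(list(range(1, 101)))
```

Under this simple model, higher GPU utilization lowers cost per token because it raises `tokens_per_second` while the hourly GPU price stays fixed.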
Related Episodes
Feature Platforms for Data-Centric AI with Mike Del Balso - #577
Machine Learning Platforms at Uber with Mike Del Balso - #115
The Evolution of the NLP Landscape with Oren Etzioni - #598
Deploying Edge and Embedded AI Systems with Heather Gorr - 655
Generative AI on the Edge with Vinesh Sukumar - 623
Compositional ML and the Future of Software Development with Dillon Erb - #520
Jupyter and the Evolution of ML Tooling with Brian Granger - #544
Interactive Machine Learning Systems with Alekh Agarwal - #17
AutoML for Natural Language Processing with Abhishek Thakur - #475
Evolving AI Systems Gracefully with Stefano Soatto - #502
Scaling AI for the Enterprise with Mazin Gilbert - #78
Live from TWIMLcon! Operationalizing ML at Scale with Hussein Mehanna - #306
Scaling Deep Learning: Systems Challenges & More with Shubho Sengupta - #14