Published Oct 1, 2024

AI Agents for Data Analysis with Shreya Shankar - 703

Explore the future of AI in data analysis with Shreya Shankar and Sam Charrington, as they delve into building agentic systems, innovative AI interface designs, DocETL for optimizing LLM data pipelines, and the critical need for specialized evaluation benchmarks for effective data processing.

Episode Highlights

Benchmark Needs

Data processing with LLMs needs specialized benchmarks because these tasks present unique challenges. Shreya Shankar highlights that current benchmarks focus on reasoning-based tasks, which differ significantly from data processing tasks that require maintaining context and making decisions throughout the process 1. She explains that data processing tasks often involve complex reasoning over large datasets, unlike the shorter, more straightforward tasks typically benchmarked in AI research 1.

Data processing requires its own set of benchmarks where the tasks, I think ideally it's not specific to a single LLM call.

---

Furthermore, Shreya notes the importance of flexibility in these benchmarks, allowing for different methods of data decomposition and orchestration of LLM calls 1.
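As a rough illustration of that flexibility, the sketch below scores only the final output of a pipeline, so a single-call strategy and a per-document strategy can be compared on the same task. Everything here is an assumption made for illustration: `fake_llm_extract` is a hypothetical stand-in for a real LLM call, and the task, pipeline names, and recall metric are not taken from the episode.

```python
from typing import Callable, List

# Hypothetical stand-in for an LLM call: "extracts" capitalized words as names.
def fake_llm_extract(text: str) -> List[str]:
    return [w.strip(".,") for w in text.split() if w[:1].isupper()]

# A pipeline is any callable from documents to a final answer, so the
# benchmark does not care how the work is decomposed into LLM calls.
Pipeline = Callable[[List[str]], List[str]]

def single_call_pipeline(docs: List[str]) -> List[str]:
    # One call over the concatenated input.
    return sorted(set(fake_llm_extract(" ".join(docs))))

def per_document_pipeline(docs: List[str]) -> List[str]:
    # One call per document, then merge the partial results.
    found = set()
    for doc in docs:
        found.update(fake_llm_extract(doc))
    return sorted(found)

def recall(predicted: List[str], expected: List[str]) -> float:
    # Fraction of expected items that the pipeline recovered.
    if not expected:
        return 1.0
    return len(set(predicted) & set(expected)) / len(set(expected))

def run_benchmark(pipeline: Pipeline, docs: List[str], expected: List[str]) -> float:
    # Score only the end-to-end output, not any individual call.
    return recall(pipeline(docs), expected)

docs = ["Alice met Bob.", "Carol called."]
expected = ["Alice", "Bob", "Carol"]
for name, pipeline in [("single_call", single_call_pipeline),
                       ("per_document", per_document_pipeline)]:
    print(name, run_benchmark(pipeline, docs, expected))
```

Because the score depends only on what the pipeline returns, any other decomposition or orchestration strategy could be evaluated the same way.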

Design Insights

Designing benchmarks for LLMs in data processing involves creating flexible evaluation frameworks that can adapt to various tasks. Shreya discusses the use of validation prompts and ranking algorithms to assess the effectiveness of different data processing plans 2. She emphasizes the need for a good validation prompt, which can significantly impact the accuracy and recall of the evaluation process 2.

Everything hinges on having a good validation prompt and a good kind of ranking algorithm here.

---

This approach allows for the synthesis of task-specific validation prompts, enabling more precise evaluations of LLM outputs in data processing contexts 2.
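The idea of validating and ranking candidate plans can be sketched as follows. This is a minimal toy under stated assumptions, not DocETL's implementation: `validate` is a hypothetical rule-based stand-in for an LLM judge driven by a validation prompt, the "extract all numbers" task is invented for the example, and ranking by mean validation score is just one possible ranking algorithm.

```python
import re
from statistics import mean
from typing import Callable, Dict, List

Plan = Callable[[str], str]               # document -> plan output
Validator = Callable[[str, str], float]   # (document, output) -> score in [0, 1]

# Hypothetical stand-in for an LLM judge: checks how many of the numbers
# in the document appear in the plan's output.
def validate(doc: str, output: str) -> float:
    expected = re.findall(r"\d+", doc)
    if not expected:
        return 1.0
    found = output.split()
    return sum(n in found for n in expected) / len(expected)

# Two toy candidate plans for the invented task "extract all numbers".
def plan_all_numbers(doc: str) -> str:
    return " ".join(re.findall(r"\d+", doc))

def plan_first_number(doc: str) -> str:
    match = re.search(r"\d+", doc)
    return match.group(0) if match else ""

def rank_plans(plans: Dict[str, Plan], samples: List[str],
               validator: Validator) -> List[str]:
    # Score each plan by its mean validation score over the samples,
    # then return plan names from best to worst.
    scores = {
        name: mean(validator(doc, plan(doc)) for doc in samples)
        for name, plan in plans.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

samples = ["paid 10 then 20", "owed 5"]
plans = {"first_number": plan_first_number, "all_numbers": plan_all_numbers}
print(rank_plans(plans, samples, validate))
```

The ranking is only as reliable as the validator, which mirrors the point in the episode that the quality of the validation prompt drives the quality of the evaluation.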

Related Episodes

Interactive Machine Learning Systems with Alekh Agarwal - #17
AI for Network Management with Shirley Wu - 710
AI for Content Creation with Debajyoti Ray - TWiML Talk #178
AI Agents and Data Integration with GPT and LLaMa with Jerry Liu - 628
AI Agents: Substance or Snake Oil with Arvind Narayanan - 704
Generative AI on the Edge with Vinesh Sukumar - 623
AutoML for Natural Language Processing with Abhishek Thakur - #475
Deploying Edge and Embedded AI Systems with Heather Gorr - 655
AI for High-Stakes Decision Making with Hima Lakkaraju - #387
Engineering the Future of AI with Ruchir Puri - #21
AI Engineering Pitfalls with Chip Huyen - 715
Robotics at OpenAI with Jonas Schneider - #76
Intelligent Infrastructure Management with Pankaj Goyal & Rochna Dhand - TWiML Talk #258
Web Scale Engineering for Machine Learning with Sharath Rao - #40
Understanding AI’s Impact on Social Disparities with Vinodkumar Prabhakaran - 617

Episode Highlights

Building Agentic Systems

Shreya Shankar, a PhD student at UC Berkeley, discusses the intricacies of agent orchestration and fault tolerance in complex data systems. She highlights the importance of structured logic and robust fault tolerance mechanisms in managing agentic systems effectively.

AI Interface Design

The discussion shifts to the future of AI interfaces, exploring the challenges and potential advancements beyond current chat-based systems. Shreya Shankar and Sam Charrington delve into the need for new interaction paradigms and more intuitive user experiences.
