AI Insights Unveiled

Nathan shares how DDPO enables intent alignment in smaller models, allowing them to outperform larger ones. He discusses the challenges of current humanoid robots and highlights the importance of RLHF in addressing biases in pre-trained models. This conversation is packed with valuable insights into the future of AI and robotics.

From this podcast
Super Data Science: ML & AI Podcast with Jon Krohn
791: Reinforcement Learning from Human Feedback (RLHF) — with Dr. Nathan Lambert