AI Insights Unveiled
Nathan shares how DDPO enables intent alignment in smaller models, allowing them to outperform larger ones. He discusses the challenges of current humanoid robots and highlights the importance of RLHF in addressing biases in pre-trained models. This conversation is packed with valuable insights into the future of AI and robotics.
From this podcast

Super Data Science: ML & AI Podcast with Jon Krohn
791: Reinforcement Learning from Human Feedback (RLHF) — with Dr. Nathan Lambert