Published Apr 16, 2023

#114 - Secrets of Deep Reinforcement Learning (Minqi Jiang)

Minqi Jiang explores deep reinforcement learning: the difficulty of defining intelligence, the strategic use of minimax regret, and how Reinforcement Learning from Human Feedback balances creativity and reliability in language models.

Related Episodes

Can we build a generalist agent? Dr. Minqi Jiang and Dr. Marc Rigter
#045 Microsoft's Platform for Reinforcement Learning (Bonsai)
#49 - Meta-Gradients in RL - Dr. Tom Zahavy (DeepMind)
Understanding Deep Learning - Prof. SIMON PRINCE [STAFF FAVOURITE]
#046 The Great ML Stagnation (Mark Saroufim and Dr. Mathew Salvaris)
#036 - Max Welling: Quantum, Manifolds & Symmetries in ML
#53 Quantum Natural Language Processing - Prof. Bob Coecke (Oxford)
Dr. Paul Lessard - Categorical/Structured Deep Learning
#65 Prof. PEDRO DOMINGOS [Unplugged]
#86 - Prof. YANN LECUN and Dr. RANDALL BALESTRIERO - SSL, Data Augmentation, Reward isn't enough [NEURIPS2022]
#107 - Dr. RAPHAËL MILLIÈRE - Linguistics, Theory of Mind, Grounding
#60 Geometric Deep Learning Blueprint (Special Edition)
CURL: Contrastive Unsupervised Representations for Reinforcement Learning
Prof. Chris Bishop's NEW Deep Learning Textbook!
Robert Lange on NN Pruning and Collective Intelligence

Dexa/Machine Learning Street Talk (MLST)

Topics covered

Popular Clips

Bias in Language Models

Language Model Consistency

Understanding MDP Differences

RLHF in Language Models

Cutting-Edge Robotics

Unsupervised Learning Potential

Intelligence and Learning

Uncertainty Estimators

Convergent Learning

Generative Models for Task Simulation

Emergent Behavior in Language Models

Reinforcement Learning Insights

Challenges in Simulated Environments

Embracing Exploration

The Power of Exploration

Episode Highlights

Intelligence Definitions

Reinforcement Learning Challenges

Language Model Dynamics
