Published Mar 23, 2021

#49 - Meta-Gradients in RL - Dr. Tom Zahavy (DeepMind)

Dr. Tom Zahavy from DeepMind delves into the complexities of reinforcement learning, examining the transformative potential of meta-gradients in enhancing AI adaptability and addressing non-stationary challenges, while also reflecting on human-like creativity and intrinsic motivation in AI systems.

Episode Highlights

Exploration

Zahavy highlights the complexities of reinforcement learning (RL), emphasizing the balance between exploration and exploitation. He notes that while RL has potential, it faces challenges such as unstable training and sample inefficiency. He also shares his experience at DeepMind, where access to resources allowed him to understand RL methods more deeply and communicate them to the community 1.

Reinforcement learning works. Every problem that I try to solve, eventually I managed to get learning, or I managed to do something in it.

Despite these challenges, he believes RL can solve complex problems, though it requires expertise and resources 2.

Non-stationarity

Addressing non-stationarity in RL environments, Zahavy discusses how meta gradients can stabilize learning. He explains that RL extends supervised learning by adding exploration and credit assignment, which lets it tackle more complex problems 3. Meta gradients, he argues, enable agents to adapt dynamically to changing environments, improving their ability to solve non-stationary problems.

Reinforcement learning is basically building on everything that was done in the bandit literature and later in the theoretical community on MDPs.

This approach, he suggests, can lead to more efficient learning and better performance in diverse environments 4.

Related Episodes

#114 - Secrets of Deep Reinforcement Learning (Minqi Jiang)
#65 Prof. PEDRO DOMINGOS [Unplugged]
#046 The Great ML Stagnation (Mark Saroufim and Dr. Mathew Salvaris)
#71 - ZAK JOST (Graph Neural Networks + Geometric DL) [UNPLUGGED]
#045 Microsoft's Platform for Reinforcement Learning (Bonsai)
#60 Geometric Deep Learning Blueprint (Special Edition)
Can we build a generalist agent? Dr. Minqi Jiang and Dr. Marc Rigter
Understanding Deep Learning - Prof. SIMON PRINCE [STAFF FAVOURITE]
#102 - Prof. MICHAEL LEVIN, Prof. IRINA RISH - Emergence, Intelligence, Transhumanism
#85 Dr. Petar Veličković (Deepmind) - Categories, Graphs, Reasoning [NEURIPS22 UNPLUGGED]
#036 - Max Welling: Quantum, Manifolds & Symmetries in ML
WelcomeAIOverlords (Zak Jost)
ICLR 2020: Yoshua Bengio and the Nature of Consciousness
#69 DR. THOMAS LUX - Interpolation of Sparse High-Dimensional Data
#037 - Tour De Bayesian with Connor Tann
