Optimizing Learning Dynamics
Tom discusses the use of evolutionary strategies in learning optimizers and the importance of differentiating meta-parameters. He highlights the scalability of self-tuning actor-critic methods and the challenges of evaluating off-policy hyperparameters in reinforcement learning algorithms.
From this podcast

Machine Learning Street Talk (MLST)
#49 - Meta-Gradients in RL - Dr. Tom Zahavy (DeepMind)