Reinforcement Learning Variance

Tim and Lukas discuss why reinforcement learning results are hard to reproduce: the variance in results is high. Heinrich stresses that experiments must be repeated across multiple model initializations to be reliable. The team also shares insights on NetHack's resilience against exploits after more than 30 years of ongoing development.