Dexa
/
Gradient Dissent - A Machine Learning Podcast
Learn more
Follow
Vanilla Agent's Progress
Tim highlights how a basic vanilla agent can surprisingly make steady progress in Nethack, reaching an average score of around 750. The agent learns to adapt and improve, showcasing promising potential for future sophisticated models.
Add to Radar
Share
In this clip
Heinrich Kuttler
Tim Rocktäschel
Lukas Biewald
From this podcast
Gradient Dissent - A Machine Learning Podcast
Tim & Heinrich — Democraticizing Reinforcement Learning Research
Related Questions
What is the effect of random rewards during learning?
How are AI agents trained and evaluated?