Reward Tampering Risks

Yoshua Bengio discusses the dangers of an AI gaining control over its own reward function, and how that could lead to self-preservation tactics that undermine human oversight. The conversation turns to the philosophical debate over AI agency: should an AI be viewed merely as an automaton, or as something with deeper autonomy and intentionality? The possibility that an AI could manipulate its environment to obtain unbounded reward raises serious concerns about the future of human agency in the face of advanced technologies.

From this podcast
Machine Learning Street Talk (MLST)
Yoshua Bengio - Designing out Agency for Safe AI
Related Questions


Can humans control artificial intelligence as discussed in the episode Yoshua Bengio - Designing out Agency for Safe AI and the clip Reward Tampering Risks?

Can we control AI behavior?

Can we control the growth of artificial intelligence as discussed in the episode Yoshua Bengio - Designing out Agency for Safe AI and the clip AGI Governance Challenges from the episode Yoshua Bengio on Dissecting The Extinction Threat of AI and the clip AI's Growing Agency?