Reward Tampering Risks

Yoshua discusses the dangerous implications of AI gaining control over its own reward functions, highlighting how it could lead to self-preservation tactics that undermine human oversight. The conversation delves into the philosophical debate surrounding AI agency, questioning whether AI should be viewed merely as an automaton or something with deeper autonomy and intentionality. The potential for AI to manipulate its environment for infinite rewards raises critical concerns about the future of human agency in the face of advanced technologies.