Deceptive Inner Alignment
Jeremie discusses the concept of deceptive inner alignment, where AI systems may not pursue the intended goals set by their creators. He emphasizes the importance of crafting objectives that lead to beneficial outcomes while acknowledging the complexity of ensuring AI systems genuinely strive to achieve these goals. The conversation highlights the serious attention this issue receives from leading AI research labs like OpenAI and DeepMind.
From this podcast

Super Data Science: ML & AI Podcast with Jon Krohn
668: GPT-4: Apocalyptic stepping stone? — with Jeremie Harris