Intelligent Alignment Theories

Connor discusses corrigibility and its role in building agents that strive for alignment. The discussion covers how highly intelligent agents might behave and what their utility functions imply. The chapter then moves into a deep dive on decision theory.