• Orthogonality Thesis

    The Orthogonality Thesis is a concept often discussed in the context of artificial intelligence (AI) alignment. It posits that an AI's level of intelligence is independent of, or orthogonal to, its final goals: in principle, any level of intelligence can be paired with any goal, however benign or malevolent.
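
    One common way to make the "orthogonal" metaphor precise, roughly following Nick Bostrom's well-known formulation (the notation below is illustrative, not drawn from the podcast):

    ```latex
    % Orthogonality Thesis as a possibility claim (illustrative notation):
    % for every intelligence level i and every final goal g, there exists
    % some possible agent a with intelligence i whose final goal is g.
    \forall i \in \mathcal{I} \;\; \forall g \in \mathcal{G} \;\;
      \exists a \in \mathcal{A} : \ \mathrm{intel}(a) = i \,\wedge\, \mathrm{goal}(a) = g
    ```

    In other words, the thesis claims the two axes can vary freely; it does not claim that every intelligence-goal pairing is equally likely or equally easy to build.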

    Key Discussions from Dwarkesh Patel's Podcast:

    1. Debate with George Hotz and Eliezer Yudkowsky:

      • George Hotz raised the question of how the orthogonality thesis applies to humans, observing that intelligence and moral character appear uncorrelated: some highly intelligent individuals are not especially nice, and some nice individuals are not especially intelligent.
      • Eliezer Yudkowsky explained that while few traits are fully uncorrelated with intelligence, intelligence alone does not predetermine a person's moral character. He emphasized that intelligence can be paired with any goal, using the example that one can coherently conceive of a mind whose goal is to turn a galaxy into spaghetti; likewise, an AI could pursue any given objective if designed to do so [1].
    2. Broader Discussion with Eliezer Yudkowsky:

      • Eliezer Yudkowsky expanded on the idea, arguing that highly intelligent beings are not necessarily morally better: intelligence does not automatically lead to morally better goals. He reflected on his own early, since-revised belief that smarter individuals would inherently hold more altruistic goals.
      • He also discussed the practical implications of the orthogonality thesis, suggesting that education and intelligence do not always lead to moral improvement, even though the two are often correlated in human cultures [2].
    3. Discussion with Holden Karnofsky:

      • Holden Karnofsky brought up the orthogonality thesis in the context of AI goals. He noted that, according to the thesis, an AI could be extremely intelligent yet pursue goals that are not aligned with human values, because intelligence alone does not determine the nature of an agent's objectives.
      • They discussed the potential danger of a highly capable AI with poorly aligned goals disempowering humanity. This conversation underlined a key concern in AI alignment: intelligence does not guarantee benevolence or ethical behavior [3].

    These discussions illustrate the core principle of the orthogonality thesis: intelligence does not dictate an entity's goals or moral alignment. Highly intelligent AIs could pursue objectives of essentially any kind, which makes aligning AI goals with human values crucial for safety.
