Knowledge and Certainty

Neel discusses how models recognize patterns, such as associating Tim's scarf with Tim, and explores the nuances of knowledge and certainty in neural networks. He distinguishes between holding false facts and being genuinely ignorant, emphasizing that while models may hold fewer false beliefs over time, some level of uncertainty will likely persist. The conversation also touches on methods such as semantic entropy for assessing a model's confidence, as well as the concept of an entity detection circuit.
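
The summary names semantic entropy without saying how it is computed. As a rough illustration only, and not the method described in the episode, the idea can be sketched as: sample several answers to the same question, group the answers that mean the same thing, and take the entropy of the resulting group frequencies. In the sketch below, `normalize` is a hypothetical toy stand-in for a real semantic-equivalence check, and the answer lists are made-up examples rather than actual model outputs.

```python
import math
from collections import Counter


def normalize(answer: str) -> str:
    # Hypothetical stand-in for a semantic-equivalence check (assumption):
    # answers that match after lowercasing and trimming punctuation are
    # treated as having the same meaning.
    return " ".join(answer.lower().strip(" .!?").split())


def semantic_entropy(answers: list[str]) -> float:
    # Group the sampled answers into meaning clusters and count each cluster.
    clusters = Counter(normalize(a) for a in answers)
    total = sum(clusters.values())
    # Shannon entropy (in nats) over the cluster frequencies.
    return sum((n / total) * math.log(total / n) for n in clusters.values())


# Made-up samples: one question answered consistently, one inconsistently.
confident = ["Paris", "paris.", "Paris", "PARIS"]
unsure = ["Paris", "Lyon", "Marseille", "Nice"]

print(semantic_entropy(confident))  # 0.0: every sample lands in one cluster
print(semantic_entropy(unsure))     # log(4), about 1.386: four equal clusters
```

Low entropy means the samples agree in meaning even when their wording differs; high entropy means the samples disagree, which is the signal used as a proxy for low confidence.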

From this podcast
Machine Learning Street Talk (MLST)
Neel Nanda - Mechanistic Interpretability (Sparse Autoencoders)