Reasoning in Models

Neel discusses the intriguing behavior of models when distinguishing between known and unknown entities, particularly in the context of movies. He highlights how models can exhibit different responses based on their knowledge, suggesting a form of self-awareness regarding their limitations. Tim adds to the conversation by exploring the implications of reasoning and externalizing thought processes, raising questions about how these mechanisms impact the model's ability to reconcile and articulate its knowledge.

From this podcast
Machine Learning Street Talk (MLST)
Neel Nanda - Mechanistic Interpretability (Sparse Autoencoders)
Related Questions
- Can neural networks be made to reason, as discussed in the episode "Ilya Sutskever: Deep Learning | Lex Fridman Podcast #94" and the clip "Neural Networks and Reasoning"?
